File bom-1.0.1.obscpio of Package bom
07070100000000000081A4000003E8000000640000000161830F5E00000029000000000000000000000000000000000000001200000000bom-1.0.1/AUTHORSArchie L. Cobbs <archie.cobbs@gmail.com> 07070100000001000081A4000003E8000000640000000161830F5E000000B0000000000000000000000000000000000000001200000000bom-1.0.1/CHANGESVersion 1.0.1 Released November 3, 2021 - Fixed bug when multi-byte sequence crossed input buffer boundary Version 1.0.0 Released October 16, 2021 - Initial release 07070100000002000081A4000003E8000000640000000161830F5E00000052000000000000000000000000000000000000001200000000bom-1.0.1/INSTALLSimplified instructions: 1. ./configure 2. make 3. sudo make install 07070100000003000081A4000003E8000000640000000161830F5E00002C5D000000000000000000000000000000000000001200000000bom-1.0.1/LICENSE Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. 
"Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. 
Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. 
You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. 
Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. 
In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. 
Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. 07070100000004000081A4000003E8000000640000000161830F5E000003F6000000000000000000000000000000000000001600000000bom-1.0.1/Makefile.am# # bom - Deals with Unicode byte order marks # # Copyright (C) 2021 Archie L. Cobbs. All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # bin_PROGRAMS= bom man_MANS= bom.1 docdir= $(datadir)/doc/packages/$(PACKAGE) doc_DATA= CHANGES LICENSE README.md INSTALL AUTHORS EXTRA_DIST= CHANGES LICENSE README.md bom_SOURCES= main.c \ gitrev.c .PHONY: tests tests: bom cd tests && ./run.sh gitrev.c: printf 'const char *const bom_version = "%s";\n' "`git describe`" > gitrev.c 07070100000005000081A4000003E8000000640000000161830F5E0000142A000000000000000000000000000000000000001400000000bom-1.0.1/README.md**bom** is a simple UNIX command line utility for dealing with Unicode byte order marks (BOM's). 
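As a rough illustration of what "dealing with" a byte order mark involves, here is a minimal sketch in C that checks the first bytes of a buffer against the BOM signatures listed in the `bom_types` table in `main.c`. The `bom_sig` struct and the `detect_bom()` helper are hypothetical names used only for this example; they are not part of **bom** itself. The sketch simply prefers the longest matching signature, so the ambiguous sequence `FF FE 00 00` is reported as UTF-32LE, which corresponds to running **bom** with `--prefer32` rather than to its default behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One known BOM: a display name plus its leading byte sequence. */
struct bom_sig {
    const char *name;
    const char *bytes;
    size_t len;
};

/*
 * Signatures copied from the bom_types table in main.c, ordered so that
 * a longer sequence is tried before any shorter sequence it starts with
 * (UTF-32LE before UTF-16LE).
 */
static const struct bom_sig bom_sigs[] = {
    { "UTF-32BE", "\x00\x00\xfe\xff", 4 },
    { "UTF-32LE", "\xff\xfe\x00\x00", 4 },
    { "GB18030",  "\x84\x31\x95\x33", 4 },
    { "UTF-8",    "\xef\xbb\xbf",     3 },
    { "UTF-7",    "\x2b\x2f\x76",     3 },
    { "UTF-16BE", "\xfe\xff",         2 },
    { "UTF-16LE", "\xff\xfe",         2 },
};

/*
 * Hypothetical helper (not part of bom): return the name of the BOM
 * found at the start of data, or "NONE" if no signature matches.
 */
static const char *
detect_bom(const void *data, size_t len)
{
    size_t i;

    assert(data != NULL);
    for (i = 0; i < sizeof(bom_sigs) / sizeof(bom_sigs[0]); i++) {
        const struct bom_sig *const sig = &bom_sigs[i];

        /* A signature matches only if the buffer is long enough to hold it */
        if (len >= sig->len && memcmp(data, sig->bytes, sig->len) == 0)
            return sig->name;
    }
    return "NONE";
}
```

Unlike this sketch, the real program reads its input one byte at a time and tracks a match state per BOM type, so it works on streams such as standard input without needing the whole prefix in memory up front.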
Unicode byte order marks are "magic number" byte sequences that sometimes appear at the beginning of a file to indicate the file's character encoding. They're sometimes helpful but usually they're just annoying. You can read more about byte order marks [here](https://en.wikipedia.org/wiki/Byte_order_mark). **bom** operates in one of the following modes: * `bom --detect` Detect which type of byte order mark is present (if any) and print to standard output * `bom --strip` Strip off the byte order mark (if any) and output the remainder of the file, optionally also converting to UTF-8 * `bom --print` Output the byte sequence corresponding to a byte order mark (useful for adding them to files) * `bom --list` List the supported byte order mark types Here is the man page: ``` BOM(1) BSD General Commands Manual BOM(1) NAME bom -- Decode Unicode byte order mark SYNOPSIS bom --strip [--expect types] [--lenient] [--prefer32] [--utf8] [file] bom --detect [--expect types] [--prefer32] [file] bom --print type bom --list bom --help bom --version DESCRIPTION bom decodes, verifies, reports, and/or strips the byte order mark (BOM) at the start of the specified file, if any. When no file is specified, or when file is -, read standard input. OPTIONS -d, --detect Report the detected BOM type to standard output and then exit. See SUPPORTED BOM TYPES for possible values. -e, --expect types Expect to find one of the specified BOM types, otherwise exit with an error. Multiple types may be specified, separated by commas. Specifying NONE is acceptable and matches when the file has no (sup- ported) BOM. -h, --help Output command line usage help. -l, --lenient Silently ignore any illegal byte sequences encountered when converting the remainder of the file to UTF-8. Without this flag, bom will exit immediately with an error if an ille- gal byte sequence is encountered. This flag has no effect unless the --utf8 flag is given. --list List the supported BOM types and exit. 
-p, --print type Output the byte sequence corresponding to the type byte order mark. --prefer32 Used to disambiguate the byte sequence FF FE 00 00, which can be either a UTF-32LE BOM or a UTF-16LE BOM followed by a NUL character. Without this flag, UTF-16LE is assumed; with this flag, UTF-32LE is assumed. -s, --strip Strip the BOM, if any, from the beginning of the file and output the remainder of the file. -u, --utf8 Convert the remainder of the file to UTF-8, assuming the character encoding implied by the detected BOM. For files with no (supported) BOM, this flag has no effect and the remainder of the file is copied unmodified. For files with a UTF-8 BOM, the identity transformation is still applied, so (for example) illegal byte sequences will be detected. -v, --version Output program version and exit. SUPPORTED BOM TYPES The supported BOM types are: NONE No supported BOM was detected. UTF-7 A UTF-7 BOM was detected. UTF-8 A UTF-8 BOM was detected. UTF-16BE A UTF-16 (Big Endian) BOM was detected. UTF-16LE A UTF-16 (Little Endian) BOM was detected. UTF-32BE A UTF-32 (Big Endian) BOM was detected. UTF-32LE A UTF-32 (Little Endian) BOM was detected. GB18030 A GB18030 (Chinese National Standard) BOM was detected. EXAMPLES To tell what kind of byte order mark a file has: $ bom --detect To normalize files with byte order marks into UTF-8, and pass other files through unchanged: $ bom --strip --utf8 Same as previous example, but discard illegal byte sequences instead of gener- ating an error: $ bom --strip --utf8 --lenient To verify a properly encoded UTF-8 or UTF-16 file with a byte-order-mark and output it as UTF-8: $ bom --strip --utf8 --expect UTF-8,UTF-16LE,UTF-16BE To just remove any byte order mark and get on with your life: $ bom --strip file RETURN VALUES bom exits with one of the following values: 0 Success. 1 A general error occurred. 2 The --expect flag was given but the detected BOM did not match. 
3 An illegal byte sequence was detected (and --lenient was not speci- fied). SEE ALSO iconv(1) bom: Decode Unicode byte order mark, https://github.com/archiecobbs/bom. AUTHOR Archie L. Cobbs <archie.cobbs@gmail.com> BSD October 14, 2021 BSD ``` 07070100000006000081ED000003E8000000640000000161830F5E0000039B000000000000000000000000000000000000001500000000bom-1.0.1/autogen.sh#!/bin/bash # # Script to regenerate all the GNU auto* gunk. # Run this from the top directory of the source tree. # # If it looks like I don't know what I'm doing here, you're right. # set -e echo "cleaning up" rm -rf autom4te*.cache scripts aclocal.m4 configure config.log config.status .deps stamp-h1 rm -f config.h.in config.h.in~ config.h rm -rf scripts find . \( -name Makefile -o -name Makefile.in \) -print0 | xargs -0 rm -f rm -f *.o bom bom.1 bom-*.tar.gz gitrev.c rm -rf a.out.* tags if [ "${1}" = '-C' ]; then exit 0 fi ACLOCAL="aclocal" AUTOHEADER="autoheader" AUTOMAKE="automake" AUTOCONF="autoconf" echo "running aclocal" mkdir scripts ${ACLOCAL} ${ACLOCAL_ARGS} -I scripts echo "running autoheader" ${AUTOHEADER} echo "running automake" ${AUTOMAKE} --add-missing -c --foreign echo "running autoconf" ${AUTOCONF} -f -i if [ "${1}" = '-c' ]; then echo "running configure" ./configure fi 07070100000007000081A4000003E8000000640000000161830F5E000012B1000000000000000000000000000000000000001300000000bom-1.0.1/bom.1.in.\" -*- nroff -*- .\" .\" bom - Deals with Unicode byte order marks .\" .\" Copyright (C) 2021 Archie L. Cobbs. All rights reserved. .\" .\" Licensed under the Apache License, Version 2.0 (the "License"); .\" you may not use this file except in compliance with the License. .\" You may obtain a copy of the License at .\" .\" http://www.apache.org/licenses/LICENSE-2.0 .\" .\" Unless required by applicable law or agreed to in writing, software .\" distributed under the License is distributed on an "AS IS" BASIS, .\" WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
.\" See the License for the specific language governing permissions and .\" limitations under the License. .\" .Dd October 14, 2021 .Dt BOM 1 .Os .Sh NAME .Nm bom .Nd Decode Unicode byte order mark .Sh SYNOPSIS .Nm .Fl \-strip .Op Fl \-expect Ar types .Op Fl \-lenient .Op Fl \-prefer32 .Op Fl \-utf8 .Op Ar file .Nm .Fl \-detect .Op Fl \-expect Ar types .Op Fl \-prefer32 .Op Ar file .Nm .Fl \-print Ar type .Nm .Fl \-list .Nm .Fl \-help .Nm .Fl \-version .Sh DESCRIPTION .Nm decodes, verifies, reports, and/or strips the byte order mark (BOM) at the start of the specified file, if any. .Pp When no .Ar file is specified, or when .Ar file is \-, read standard input. .Sh OPTIONS .Bl -tag -width Ds .It Fl d , Fl \-detect Report the detected BOM type to standard output and then exit. .Pp See .Sx "SUPPORTED BOM TYPES" for possible values. .It Fl e , Fl \-expect Ar types Expect to find one of the specified BOM types, otherwise exit with an error. .Pp Multiple types may be specified, separated by commas. .Pp Specifying .Ar NONE is acceptable and matches when the file has no (supported) BOM. .It Fl h , Fl \-help Output command line usage help. .It Fl l , Fl \-lenient Silently ignore any illegal byte sequences encountered when converting the remainder of the file to UTF-8. .Pp Without this flag, .Nm will exit immediately with an error if an illegal byte sequence is encountered. .Pp This flag has no effect unless the .Fl \-utf8 flag is given. .It Fl \-list List the supported BOM types and exit. .It Fl p , Fl \-print Ar type Output the byte sequence corresponding to the .Ar type byte order mark. .It Fl \-prefer32 Used to disambiguate the byte sequence .Ar "FF FE 00 00" , which can be either a .Ar UTF-32LE BOM or a .Ar UTF-16LE BOM followed by a NUL character. .Pp Without this flag, .Ar UTF-16LE is assumed; with this flag, .Ar UTF-32LE is assumed. .It Fl s , Fl \-strip Strip the BOM, if any, from the beginning of the file and output the remainder of the file. 
.It Fl u , Fl \-utf8 Convert the remainder of the file to UTF-8, assuming the character encoding implied by the detected BOM. .Pp For files with no (supported) BOM, this flag has no effect and the remainder of the file is copied unmodified. .Pp For files with a UTF-8 BOM, the identity transformation is still applied, so (for example) illegal byte sequences will be detected. .It Fl v , Fl \-version Output program version and exit. .El .Sh SUPPORTED BOM TYPES The supported BOM types are: .Bl -tag -width Ds .It NONE No supported BOM was detected. .It UTF-7 A UTF-7 BOM was detected. .It UTF-8 A UTF-8 BOM was detected. .It UTF-16BE A UTF-16 (Big Endian) BOM was detected. .It UTF-16LE A UTF-16 (Little Endian) BOM was detected. .It UTF-32BE A UTF-32 (Big Endian) BOM was detected. .It UTF-32LE A UTF-32 (Little Endian) BOM was detected. .It GB18030 A GB18030 (Chinese National Standard) BOM was detected. .El .Sh EXAMPLES .Pp To tell what kind of byte order mark a file has: .Bd -literal -offset indent $ bom --detect file .Ed .Pp To normalize files with byte order marks into UTF-8, and pass other files through unchanged: .Bd -literal -offset indent $ bom --strip --utf8 file .Ed .Pp Same as previous example, but discard illegal byte sequences instead of generating an error: .Bd -literal -offset indent $ bom --strip --utf8 --lenient file .Ed .Pp To verify a properly encoded UTF-8 or UTF-16 file with a byte-order-mark and output it as UTF-8: .Bd -literal -offset indent $ bom --strip --utf8 --expect UTF-8,UTF-16LE,UTF-16BE file .Ed .Pp To just remove any byte order mark and get on with your life: .Bd -literal -offset indent $ bom --strip file .Ed .Sh RETURN VALUES .Nm exits with one of the following values: .Bl -tag -width Ds .It 0 Success. .It 1 A general error occurred. .It 2 The .Fl \-expect flag was given but the detected BOM did not match. .It 3 An illegal byte sequence was detected (and .Fl \-lenient was not specified). 
.El .Sh SEE ALSO .Xr iconv 1 .Rs .%T "bom: Decode Unicode byte order mark" .%O https://github.com/archiecobbs/bom .Re .Rs .%T "Byte order mark (Wikipedia)" .%O https://en.wikipedia.org/wiki/Byte_order_mark .Re .Sh AUTHOR .An Archie L. Cobbs Aq archie.cobbs@gmail.com 07070100000008000081A4000003E8000000640000000161830F5E00000A07000000000000000000000000000000000000001700000000bom-1.0.1/configure.ac# # bom - Deals with Unicode byte order marks # # Copyright (C) 2021 Archie L. Cobbs. All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
# AC_INIT([bom Deals with Unicode byte order marks], [1.0.1], [https://github.com/archiecobbs/bom/], [bom]) AC_CONFIG_AUX_DIR(scripts) AM_INIT_AUTOMAKE dnl AM_MAINTAINER_MODE AC_PREREQ(2.59) AC_PREFIX_DEFAULT(/usr) AC_PROG_MAKE_SET [CFLAGS="-g -O3 -pipe -Wall -Waggregate-return -Wcast-align -Wchar-subscripts -Wcomment -Wformat -Wimplicit -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wno-long-long -Wparentheses -Wpointer-arith -Wredundant-decls -Wreturn-type -Wswitch -Wtrigraphs -Wuninitialized -Wunused -Wwrite-strings -Wshadow -Wstrict-prototypes -Wcast-qual $CFLAGS"] AC_SUBST(CFLAGS) # Compile flags for Linux AC_DEFINE(_DEFAULT_SOURCE, 1, Default functions) AC_DEFINE(_GNU_SOURCE, 1, GNU functions) AC_DEFINE(_BSD_SOURCE, 1, BSD functions) AC_DEFINE(_XOPEN_SOURCE, 500, XOpen functions) # Compile flags for Mac OS AC_DEFINE(_DARWIN_C_SOURCE, 1, MacOS functions) # Check for required programs AC_PROG_INSTALL AC_PROG_CC AC_PATH_PROG([CAT], [cat], [], []) if test "x${CAT}" = "x"; then AC_MSG_ERROR([cat not found]) fi AC_PATH_PROG([SED], [sed], [], []) if test "x${SED}" = "x"; then AC_MSG_ERROR([sed not found]) fi # Check for required libc functions AC_SEARCH_LIBS([iconv_open], [iconv],, [if test `uname -o` = 'Cygwin' -a -f /usr/lib/libiconv.a; then LIBS="-liconv ${LIBS}"; else AC_MSG_ERROR([required function iconv_open missing]); fi]) # Check for required header files AC_HEADER_STDC AC_CHECK_HEADERS(ctype.h errno.h stdio.h stdlib.h string.h unistd.h sys/stat.h sys/types.h, [], [AC_MSG_ERROR([required header file '$ac_header' missing])]) # Optional features AC_ARG_ENABLE(Werror, AC_HELP_STRING([--enable-Werror], [enable compilation with -Werror flag (default NO)]), [test x"$enableval" = "xyes" && CFLAGS="${CFLAGS} -Werror"]) # Generated files AC_CONFIG_FILES(Makefile) AC_CONFIG_FILES(bom.1) AM_CONFIG_HEADER(config.h) # Go AC_OUTPUT 07070100000009000081A4000003E8000000640000000161830F5E000049E6000000000000000000000000000000000000001100000000bom-1.0.1/main.c/* * 
bom - Deals with Unicode byte order marks * * Copyright (C) 2021 Archie L. Cobbs. All rights reserved. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. * You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. */ #include <assert.h> #include <ctype.h> #include <err.h> #include <errno.h> #include <getopt.h> #include <iconv.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> // Copyright character #define COPYRIGHT "\xc2\xa9" // Special exit values #define EX_EXPECT_FAIL 2 #define EX_ILLEGAL_BYTES 3 // Version string extern const char *const bom_version; // Command line options that only have long versions #define FLAG_LIST (-2) #define FLAG_PREFER_32 (-3) #define OPT(_letter, _name, _arg) \ { \ .name= _name, \ .has_arg= _arg, \ .flag= NULL, \ .val= _letter \ } static const struct option long_options[] = { OPT('d', "detect", no_argument), OPT('e', "expect", required_argument), OPT('h', "help", no_argument), OPT(FLAG_LIST, "list", no_argument), OPT('l', "lenient", no_argument), OPT('p', "print", required_argument), OPT(FLAG_PREFER_32, "prefer32", no_argument), OPT('s', "strip", no_argument), OPT('u', "utf8", no_argument), OPT('v', "version", no_argument), OPT(0, NULL, 0) }; // Execution modes #define MODE_STRIP 1 #define MODE_DETECT 2 #define MODE_LIST 3 #define MODE_PRINT 4 #define MODE_HELP 5 #define MODE_VERSION 6 // BOM types struct bom_type { const char *name; const char *encoding; const char *bytes; const int len; }; #define BOM_TYPE(_name, _encoding, _bytes) \ { \ .name= _name, \ .encoding= _encoding, \ 
.bytes= _bytes, \ .len= sizeof(_bytes) - 1 \ } static const struct bom_type bom_types[] = { BOM_TYPE("NONE", NULL, ""), BOM_TYPE("UTF-7", "UTF-7", "\x2b\x2f\x76"), BOM_TYPE("UTF-8", "UTF-8", "\xef\xbb\xbf"), BOM_TYPE("UTF-16BE", "UTF-16BE", "\xfe\xff"), BOM_TYPE("UTF-16LE", "UTF-16LE", "\xff\xfe"), BOM_TYPE("UTF-32BE", "UTF-32BE", "\x00\x00\xfe\xff"), BOM_TYPE("UTF-32LE", "UTF-32LE", "\xff\xfe\x00\x00"), BOM_TYPE("GB18030", "GB18030", "\x84\x31\x95\x33"), }; #define BOM_TYPE_NONE 0 #define BOM_TYPE_UTF_7 1 #define BOM_TYPE_UTF_8 2 #define BOM_TYPE_UTF_16BE 3 #define BOM_TYPE_UTF_16LE 4 #define BOM_TYPE_UTF_32BE 5 #define BOM_TYPE_UTF_32LE 6 #define BOM_TYPE_GB18030 7 #define BOM_TYPE_MAX 8 // Input buffer #define BUFFER_SIZE 1024 struct bom_input { char buf[BUFFER_SIZE]; int len; int num_complete; int num_finished; int match_state[BOM_TYPE_MAX]; }; #define MATCH_PREFIX 0 #define MATCH_COMPLETE 1 #define MATCH_FAILED 2 // Mode of execution functions static void bom_detect(FILE *fp, long expect_types, int prefer32); static void bom_strip(FILE *fp, long expect_types, int lenient, int prefer32, int utf8); static void bom_list(void); static void bom_print(int bom_type); // Helper functions static int read_bom(FILE *fp, struct bom_input *const input, long expect_types, int prefer32); static int read_byte(FILE *fp, struct bom_input *input); static int bom_type_from_name(const char *name); static void init_bom_input(struct bom_input *const input); static void set_mode(int *modep, int mode); static void usage(void); int main(int argc, char **argv) { const struct option *opt; char optstring[32]; long expect_types = 0; int option_index; int bom_type = -1; int prefer32 = 0; int lenient = 0; FILE *fp = NULL; int mode = 0; int utf8 = 0; char *s; int ch; // Build optstring dynamically s = optstring; for (opt = long_options; opt->name != NULL; opt++) { if (opt->val > 0) { *s++ = (char)opt->val; if (opt->has_arg) *s++ = ':'; } } *s = '\0'; // Parse command line while ((ch = 
getopt_long(argc, argv, optstring, long_options, &option_index)) != -1) { switch (ch) { case 'd': set_mode(&mode, MODE_DETECT); break; case 'e': while ((s = strsep(&optarg, ",")) != NULL) { if ((bom_type = bom_type_from_name(s)) >= sizeof(expect_types) * 8) errx(1, "internal error: %s", "too many BOM types"); expect_types |= (1 << bom_type); } break; case 'h': set_mode(&mode, MODE_HELP); break; case 'l': lenient = 1; break; case 'p': bom_type = bom_type_from_name(optarg); set_mode(&mode, MODE_PRINT); break; case 's': set_mode(&mode, MODE_STRIP); break; case 'u': utf8 = 1; break; case 'v': set_mode(&mode, MODE_VERSION); break; case FLAG_PREFER_32: prefer32 = 1; break; case FLAG_LIST: set_mode(&mode, MODE_LIST); break; case '?': default: usage(); return 1; } } argv += optind; argc -= optind; // Parse remainder of command line switch (mode) { case MODE_STRIP: case MODE_DETECT: switch (argc) { case 0: fp = stdin; break; case 1: if (strcmp(argv[0], "-") == 0) { fp = stdin; break; } if ((fp = fopen(argv[0], "r")) == NULL) err(1, "%s", argv[0]); break; default: usage(); return 1; } break; default: switch (argc) { case 0: break; default: usage(); return 1; } break; } // Execute switch (mode) { case MODE_STRIP: bom_strip(fp, expect_types, lenient, prefer32, utf8); break; case MODE_DETECT: bom_detect(fp, expect_types, prefer32); break; case MODE_LIST: bom_list(); break; case MODE_PRINT: bom_print(bom_type); break; case MODE_HELP: usage(); break; case MODE_VERSION: fprintf(stderr, "bom %s\n", bom_version); fprintf(stderr, "Copyright %s Archie L. Cobbs. 
All rights reserved.\n", COPYRIGHT); break; default: usage(); return 1; } // Done return 0; } static void bom_detect(FILE *fp, long expect_types, int prefer32) { const struct bom_type *bt; struct bom_input input; int bom_type; // Read BOM init_bom_input(&input); bom_type = read_bom(fp, &input, expect_types, prefer32); bt = &bom_types[bom_type]; // Print its name printf("%s\n", bt->name); } #if DEBUG_ICONV_OPS #define BYTES_PER_ROW 20 static void debug_buffer(const size_t base, const void *data, size_t len) { size_t offset; size_t i; if (data == NULL) { fprintf(stderr, " NULL\n"); return; } for (offset = 0; offset < len; offset += BYTES_PER_ROW) { fprintf(stderr, "%08d: ", (unsigned int)(base + offset)); for (i = 0; i < BYTES_PER_ROW; i++) { const int val = offset + i < len ? *((const char *)data + offset + i) & 0xff : -1; if (i == BYTES_PER_ROW / 2) fprintf(stderr, " "); if (val != -1) fprintf(stderr, " %02x", val); else fprintf(stderr, " "); } fprintf(stderr, " "); for (i = 0; i < BYTES_PER_ROW; i++) { const int val = offset + i < len ? *((const char *)data + offset + i) & 0xff : -1; if (val != -1) fprintf(stderr, "%c", isprint(val) ? val : '.'); else fprintf(stderr, " "); } fprintf(stderr, "\n"); } } #endif /* DEBUG_ICONV_OPS */ static void bom_strip(FILE *fp, long expect_types, int lenient, int prefer32, int utf8) { const struct bom_type *bt; struct bom_input input; char ibuf[BUFFER_SIZE]; char obuf[BUFFER_SIZE]; char tocode[32]; size_t offset; iconv_t icd = 0; int done = 0; int bom_type; int ilen; // Read BOM init_bom_input(&input); bom_type = read_bom(fp, &input, expect_types, prefer32); bt = &bom_types[bom_type]; // If BOM type is NONE, then obviously we can't convert to UTF-8 if (bom_type == BOM_TYPE_NONE) utf8 = 0; // Initialize iconv conversion engine if (utf8) { snprintf(tocode, sizeof(tocode), "%s%s", bom_types[BOM_TYPE_UTF_8].encoding, lenient ? 
"//IGNORE" : ""); if ((icd = iconv_open(tocode, bt->encoding)) == (iconv_t)-1) err(1, "iconv: \"%s\" -> \"%s\"", bt->encoding, tocode); } // Copy over any bytes we read after the BOM into our input buffer ilen = input.len - bt->len; memcpy(ibuf, input.buf + bt->len, ilen); offset = bt->len; // Convert remainder of file while (!done) { size_t nread; size_t nwrit; char *iptr; char *optr; size_t iremain; size_t oremain; int eof = 0; size_t r; // Fill the input buffer while (ilen < sizeof(ibuf)) { if ((nread = fread(ibuf + ilen, 1, sizeof(ibuf) - ilen, fp)) == 0) { if (ferror(fp)) err(1, "read error"); eof = 1; break; } ilen += nread; } // When the input buffer is empty and we couldn't add anything more, this is the last round done = ilen == 0; // Convert bytes (unless BOM_TYPE_NONE) iptr = ibuf; optr = obuf; iremain = ilen; oremain = sizeof(obuf); // Convert to UTF-8 or just pass through if (utf8) { #if DEBUG_ICONV_OPS fprintf(stderr, "->iconv@%d: ilen=%d\n", (int)offset, (int)ilen); debug_buffer(offset, iptr, ilen); #endif r = iconv(icd, !done ? 
&iptr : NULL, &iremain, &optr, &oremain); #if DEBUG_ICONV_OPS { const int errno_save = errno; fprintf(stderr, "<-iconv@%d: r=%d errno=%d iptr@%d optr@%d\n", (int)offset, (int)r, errno, (int)(iptr - ibuf), (int)(optr - obuf)); debug_buffer(offset, obuf, optr - obuf); errno = errno_save; } #endif if (r == (size_t)-1) { switch (errno) { case EINVAL: // incomplete multi-byte sequence at the end of the input buffer if (!done && !eof) break; // FALLTHROUGH case EILSEQ: // an invalid byte sequence was detected if (lenient) { iptr += iremain; // avoid an infinite loop on trailing partial multi-byte sequence iremain = 0; break; } errx(EX_ILLEGAL_BYTES, "invalid %s byte sequence at file offset %lu", bt->name, offset + (iptr - ibuf)); default: err(1, "iconv"); } } } else { // behave like iconv() would but just copy the bytes memcpy(optr, iptr, ilen); if (!done) iptr += ilen; iremain = 0; optr += ilen; oremain -= ilen; } // Update file offset offset += ilen - iremain; // Shift unprocessed input for next time memmove(ibuf, iptr, iremain); ilen = iremain; // Write output oremain = optr - obuf; optr = obuf; while (oremain > 0 && (nwrit = fwrite(optr, 1, oremain, stdout)) > 0) { optr += nwrit; oremain -= nwrit; } if (ferror(stdout)) err(1, "write error"); } if (fflush(stdout) == EOF) err(1, "write error"); // Close conversion if (utf8) (void)iconv_close(icd); } static void bom_list(void) { int bom_type; for (bom_type = 0; bom_type < BOM_TYPE_MAX; bom_type++) { const struct bom_type *const bt = &bom_types[bom_type]; printf("%s\n", bt->name); } } static void bom_print(int bom_type) { const struct bom_type *const bt = &bom_types[bom_type]; int i; for (i = 0; i < bt->len; i++) { if (putchar(bt->bytes[i] & 0xff) == EOF) err(1, "write error"); } } static int read_bom(FILE *fp, struct bom_input *const input, long expect_types, int prefer32) { int bom_type; // Read bytes until all BOM's are either completely matched or have failed to match while (read_byte(fp, input)) { if 
(input->num_finished == BOM_TYPE_MAX) break; } // Handle the UTF-16LE vs. UTF-32LE ambiguity if (input->match_state[BOM_TYPE_UTF_16LE] == MATCH_COMPLETE && input->match_state[BOM_TYPE_UTF_32LE] == MATCH_COMPLETE) { input->match_state[prefer32 ? BOM_TYPE_UTF_16LE : BOM_TYPE_UTF_32LE] = MATCH_FAILED; input->num_complete--; } // At this point there should be BOM_TYPE_NONE and at most one other match assert(input->match_state[BOM_TYPE_NONE] == MATCH_COMPLETE); switch (input->num_complete) { case 1: bom_type = BOM_TYPE_NONE; break; case 2: for (bom_type = 0; bom_type < BOM_TYPE_MAX; bom_type++) { if (bom_type != BOM_TYPE_NONE && input->match_state[bom_type] == MATCH_COMPLETE) break; } if (bom_type < BOM_TYPE_MAX) break; // FALLTHROUGH default: errx(1, "internal error: %s", ">2 BOM type matches"); } // Check expected BOM type if (expect_types != 0 && (expect_types & (1 << bom_type)) == 0) errx(EX_EXPECT_FAIL, "unexpected BOM type %s", bom_types[bom_type].name); // Done return bom_type; } static int bom_type_from_name(const char *name) { int bom_type; for (bom_type = 0; bom_type < BOM_TYPE_MAX; bom_type++) { if (strcmp(bom_types[bom_type].name, name) == 0) return bom_type; } errx(1, "unknown BOM type \"%s\"", name); } static int read_byte(FILE *fp, struct bom_input *const input) { int bom_type; int ch; // Read next byte if ((ch = getc(fp)) == EOF) { if (ferror(fp)) err(1, "read error"); return 0; } // Update state if (input->len >= sizeof(input->buf)) errx(1, "internal error: %s", "input buffer overflow"); for (bom_type = 0; bom_type < BOM_TYPE_MAX; bom_type++) { const struct bom_type *const bt = &bom_types[bom_type]; switch (input->match_state[bom_type]) { case MATCH_PREFIX: if (bt->bytes[input->len] != (char)ch) { input->match_state[bom_type] = MATCH_FAILED; input->num_finished++; } else if (bt->len == input->len + 1) { input->match_state[bom_type] = MATCH_COMPLETE; input->num_finished++; input->num_complete++; } break; case MATCH_COMPLETE: case MATCH_FAILED: break; 
default: errx(1, "internal error: %s", "invalid match state"); } } input->buf[input->len++] = (char)ch; return 1; } static void init_bom_input(struct bom_input *const input) { memset(input, 0, sizeof(*input)); input->match_state[BOM_TYPE_NONE] = MATCH_COMPLETE; input->num_complete = 1; input->num_finished = 1; } static void set_mode(int *modep, int mode) { if (*modep != 0) { usage(); exit(1); } *modep = mode; } static void usage(void) { fprintf(stderr, "Usage:\n"); fprintf(stderr, " bom --strip [--expect types] [--lenient] [--prefer32] [--utf8] [file]\n"); fprintf(stderr, " bom --detect [--expect types] [--prefer32] [file]\n"); fprintf(stderr, " bom --list\n"); fprintf(stderr, " bom --print type\n"); fprintf(stderr, " bom --help\n"); fprintf(stderr, " bom --version\n"); fprintf(stderr, "Options:\n"); fprintf(stderr, " -d, --detect Report the detected BOM type and exit\n"); fprintf(stderr, " -e, --expect types Expect the specified BOM type(s) (separated by commas)\n"); fprintf(stderr, " -h, --help Output command line usage summary\n"); fprintf(stderr, " -l, --lenient Skip invalid input byte sequences instead of failing\n"); fprintf(stderr, " --list List the supported BOM types\n"); fprintf(stderr, " -p, --print type Output the byte sequence corresponding to \"type\"\n"); fprintf(stderr, " --prefer32 Prefer UTF-32LE instead of UTF-16LE followed by NUL\n"); fprintf(stderr, " -s, --strip Strip the BOM and output the remainder of the file\n"); fprintf(stderr, " -u, --utf8 Convert the remainder of the file to UTF-8\n"); fprintf(stderr, " -v, --version Output program version and exit\n"); } 0707010000000A000081ED000003E8000000640000000161830F5E00000156000000000000000000000000000000000000001500000000bom-1.0.1/manpage.sh#!/bin/bash # Bail on error set -e NCOLS="83" MANPAGE="bom.1.in" sed '/man page/q' < README.md > README.md.NEW printf '```\n' >> README.md.NEW groff -r LL=${NCOLS}n -r LT=${NCOLS}n -Tlatin1 -man "${MANPAGE}" \ | sed -r -e 's/.\x08(.)/\1/g' -e 's/[[0-9]+m//g' 
\ >> README.md.NEW printf '```\n' >> README.md.NEW mv README.md{.NEW,} 0707010000000B000041ED000003E8000000640000000261830F5E00000000000000000000000000000000000000000000001000000000bom-1.0.1/tests0707010000000C000081ED000003E8000000640000000161830F5E00000CEA000000000000000000000000000000000000001700000000bom-1.0.1/tests/run.sh#!/bin/bash # Bail on error set -e # Setup temporary files TMP_STDOUT_EXPECTED='bom-test-out-expected.tmp' TMP_STDERR_EXPECTED='bom-test-err-expected.tmp' TMP_STDOUT_ACTUAL='bom-test-out-actual.tmp' TMP_STDERR_ACTUAL='bom-test-err-actual.tmp' TMP_SWAP_FILE=''bom-test-hexdump.tmp trap "rm -f \ ${TMP_STDOUT_EXPECTED} \ ${TMP_STDERR_EXPECTED} \ ${TMP_STDOUT_ACTUAL} \ ${TMP_STDERR_ACTUAL} \ ${TMP_SWAP_FILE}" 0 2 3 5 10 13 15 # Convert a file to hexdump version hexdumpify() { FILE="${1}" hexdump -C < "${FILE}" > "${TMP_SWAP_FILE}" mv "${TMP_SWAP_FILE}" "${FILE}" } # Compare files, on failure set ${DIFF_FAIL} checkdiff() { if [ "${1}" = '-h' ]; then HEXDUMPIFY='true' shift else HEXDUMPIFY='false' fi TESTFILE="${1}" WHAT="${2}" EXPECTED="${3}" ACTUAL="${4}" if diff -q "${EXPECTED}" "${ACTUAL}" >/dev/null; then return 0 fi echo "test: ${TESTFILE}: ${WHAT} mismatch" echo '------------------------------------------------------' if [ "${HEXDUMPIFY}" = 'true' ]; then hexdumpify "${EXPECTED}" hexdumpify "${ACTUAL}" fi diff -u "${EXPECTED}" "${ACTUAL}" || true echo '------------------------------------------------------' DIFF_FAIL='true' } # Execute one test, on failure set ${TEST_FAIL} runtest() { # Read test data unset FLAGS unset STDIN unset STDOUT unset STDERR unset EXITVAL . 
"${TESTFILE}" if [ -z "${FLAGS+x}" \ -o -z "${STDIN+x}" \ -o -z "${STDOUT+x}" \ -o -z "${STDERR+x}" \ -o -z "${EXITVAL+x}" ]; then echo "test: ${TESTFILE}: invalid test file" exit 1 fi # Set up files echo -en "${STDOUT}" > "${TMP_STDOUT_EXPECTED}" echo -en "${STDERR}" > "${TMP_STDERR_EXPECTED}" set +e echo -en "${STDIN}" | ../bom ${FLAGS} >"${TMP_STDOUT_ACTUAL}" 2>"${TMP_STDERR_ACTUAL}" ACTUAL_EXITVAL="$?" set -e # Special hacks if [ "${STDERR}" = '!USAGE!' ]; then ../bom --help 2>"${TMP_STDERR_EXPECTED}" fi # Check result DIFF_FAIL='false' checkdiff -h "${TESTFILE}" "standard output" "${TMP_STDOUT_EXPECTED}" "${TMP_STDOUT_ACTUAL}" checkdiff "${TESTFILE}" "standard error" "${TMP_STDERR_EXPECTED}" "${TMP_STDERR_ACTUAL}" if [ "${DIFF_FAIL}" != 'false' ]; then TEST_FAIL='true' fi if [ "${ACTUAL_EXITVAL}" -ne "${EXITVAL}" ]; then echo "test: ${TESTFILE}: exit value ${ACTUAL_EXITVAL} != ${EXITVAL}" TEST_FAIL='true' fi # Print success or if failure show params if [ "${TEST_FAIL}" = 'false' ]; then echo "test: ${TESTFILE}: success" else echo "******************************************************" echo "test: ${TESTFILE} FAILED with:" echo " FLAGS='${FLAGS}'" echo " STDIN='${STDIN}'" echo "******************************************************" fi } # Find all tests and run them ANY_FAIL='false' for TESTFILE in `find . 
-maxdepth 1 -type f -name 'test-*.tst' | sort | sed 's|^./||g'`; do TEST_FAIL='false' runtest "${TESTFILE}" if [ "${TEST_FAIL}" != 'false' ]; then ANY_FAIL='true' fi done # Exit with error if any test failed if [ "${ANY_FAIL}" != 'false' ]; then exit 1 fi 0707010000000D000081A4000003E8000000640000000161830F5E00000040000000000000000000000000000000000000002600000000bom-1.0.1/tests/test-detect-empty.tstFLAGS='--detect' STDIN='' STDOUT='NONE\n' STDERR='' EXITVAL='0' 0707010000000E000081A4000003E8000000640000000161830F5E00000064000000000000000000000000000000000000002B00000000bom-1.0.1/tests/test-detect-expect-001.tstFLAGS='--detect --expect UTF-8' STDIN='\xef\xbb\xbfblahblah' STDOUT='UTF-8\n' STDERR='' EXITVAL='0' 0707010000000F000081A4000003E8000000640000000161830F5E00000080000000000000000000000000000000000000002B00000000bom-1.0.1/tests/test-detect-expect-002.tstFLAGS='--detect --expect UTF-16LE' STDIN='\xef\xbb\xbfblahblah' STDOUT='' STDERR='bom: unexpected BOM type UTF-8\n' EXITVAL='2' 07070100000010000081A4000003E8000000640000000161830F5E00000044000000000000000000000000000000000000002800000000bom-1.0.1/tests/test-detect-partial.tstFLAGS='--detect' STDIN='\xff' STDOUT='NONE\n' STDERR='' EXITVAL='0' 07070100000011000081A4000003E8000000640000000161830F5E0000007D000000000000000000000000000000000000002400000000bom-1.0.1/tests/test-list-types.tstFLAGS='--list' STDIN='' STDOUT='NONE\nUTF-7\nUTF-8\nUTF-16BE\nUTF-16LE\nUTF-32BE\nUTF-32LE\nGB18030\n' STDERR='' EXITVAL='0' 07070100000012000081A4000003E8000000640000000161830F5E0000004E000000000000000000000000000000000000002600000000bom-1.0.1/tests/test-prefer32-001.tstFLAGS='-d' STDIN='\xff\xfe\x00\x00' STDOUT='UTF-16LE\n' STDERR='' EXITVAL='0' 07070100000013000081A4000003E8000000640000000161830F5E00000059000000000000000000000000000000000000002600000000bom-1.0.1/tests/test-prefer32-002.tstFLAGS='-d --prefer32' STDIN='\xff\xfe\x00\x00' STDOUT='UTF-32LE\n' STDERR='' EXITVAL='0' 
07070100000014000081A4000003E8000000640000000161830F5E00000051000000000000000000000000000000000000002700000000bom-1.0.1/tests/test-print-GB18030.tstFLAGS='--print GB18030' STDIN='' STDOUT='\x84\x31\x95\x33' STDERR='' EXITVAL='0' 07070100000015000081A4000003E8000000640000000161830F5E0000003E000000000000000000000000000000000000002400000000bom-1.0.1/tests/test-print-NONE.tstFLAGS='--print NONE' STDIN='' STDOUT='' STDERR='' EXITVAL='0' 07070100000016000081A4000003E8000000640000000161830F5E00000062000000000000000000000000000000000000002700000000bom-1.0.1/tests/test-print-UNKNOWN.tstFLAGS='--print UNKNOWN' STDIN='' STDOUT='' STDERR='bom: unknown BOM type "UNKNOWN"\n' EXITVAL='1' 07070100000017000081A4000003E8000000640000000161830F5E0000004A000000000000000000000000000000000000002800000000bom-1.0.1/tests/test-print-UTF-16BE.tstFLAGS='--print UTF-16BE' STDIN='' STDOUT='\xfe\xff' STDERR='' EXITVAL='0' 07070100000018000081A4000003E8000000640000000161830F5E0000004A000000000000000000000000000000000000002800000000bom-1.0.1/tests/test-print-UTF-16LE.tstFLAGS='--print UTF-16LE' STDIN='' STDOUT='\xff\xfe' STDERR='' EXITVAL='0' 07070100000019000081A4000003E8000000640000000161830F5E00000052000000000000000000000000000000000000002800000000bom-1.0.1/tests/test-print-UTF-32BE.tstFLAGS='--print UTF-32BE' STDIN='' STDOUT='\x00\x00\xfe\xff' STDERR='' EXITVAL='0' 0707010000001A000081A4000003E8000000640000000161830F5E00000052000000000000000000000000000000000000002800000000bom-1.0.1/tests/test-print-UTF-32LE.tstFLAGS='--print UTF-32LE' STDIN='' STDOUT='\xff\xfe\x00\x00' STDERR='' EXITVAL='0' 0707010000001B000081A4000003E8000000640000000161830F5E0000004B000000000000000000000000000000000000002500000000bom-1.0.1/tests/test-print-UTF-7.tstFLAGS='--print UTF-7' STDIN='' STDOUT='\x2b\x2f\x76' STDERR='' EXITVAL='0' 0707010000001C000081A4000003E8000000640000000161830F5E0000004B000000000000000000000000000000000000002500000000bom-1.0.1/tests/test-print-UTF-8.tstFLAGS='--print UTF-8' STDIN='' 
STDOUT='\xef\xbb\xbf' STDERR='' EXITVAL='0' 0707010000001D000081A4000003E8000000640000000161830F5E0000008E000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-strip-001.tstFLAGS='--strip --utf8' STDIN='\xef\xbb\xbftest123\xff456' STDOUT='' STDERR='bom: invalid UTF-8 byte sequence at file offset 10\n' EXITVAL='3' 0707010000001E000081A4000003E8000000640000000161830F5E0000006E000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-strip-002.tstFLAGS='--strip --lenient --utf8' STDIN='\xef\xbb\xbftest123\xff456' STDOUT='test123456' STDERR='' EXITVAL='0' 0707010000001F000081A4000003E8000000640000000161830F5E00000061000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-strip-003.tstFLAGS='--strip' STDIN='\xef\xbb\xbftest123\xff456' STDOUT='test123\xff456' STDERR='' EXITVAL='0' 07070100000020000081A4000003E8000000640000000161830F5E000000F1000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-strip-004.tst# The input is truncated after 2/3 of a rightwards arrow U2192 -> e2 86 92 FLAGS='--strip --expect UTF-8 --utf8' STDIN='\xef\xbb\xbfpartial arrow: \xe2\x86' STDOUT='' STDERR='bom: invalid UTF-8 byte sequence at file offset 18\n' EXITVAL='3' 07070100000021000081A4000003E8000000640000000161830F5E000000D6000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-strip-005.tst# The input is truncated after 2/3 of a rightwards arrow U2192 -> e2 86 92 FLAGS='--strip --expect UTF-8 --utf8 --lenient' STDIN='\xef\xbb\xbfpartial arrow: \xe2\x86' STDOUT='partial arrow: ' STDERR='' EXITVAL='0' 07070100000022000081A4000003E8000000640000000161830F5E0000014B000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-strip-006.tst# This has a multi-byte sequence that crosses our input buffer boundary FLAGS='--strip --expect UTF-8 --utf8' STDIN_BOM='\xef\xbb\xbf' STDIN_1019=`yes aaaaaaaaaaaaaaa | tr -d \\\\n | head -c 1023` STDIN_ARROW='\xe2\x86\x92' STDIN="${STDIN_BOM}${STDIN_1019}${STDIN_ARROW}" 
STDOUT="${STDIN_1019}${STDIN_ARROW}" STDERR='' EXITVAL='0' 07070100000023000081A4000003E8000000640000000161830F5E00000039000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-001.tstFLAGS='' STDIN='' STDOUT='' STDERR='!USAGE!' EXITVAL='1' 07070100000024000081A4000003E8000000640000000161830F5E00000049000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-002.tstFLAGS='--strip --detect' STDIN='' STDOUT='' STDERR='!USAGE!' EXITVAL='1' 07070100000025000081A4000003E8000000640000000161830F5E00000048000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-003.tstFLAGS='--detect --list' STDIN='' STDOUT='' STDERR='!USAGE!' EXITVAL='1' 07070100000026000081A4000003E8000000640000000161830F5E0000004C000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-004.tstFLAGS='--list --print NONE' STDIN='' STDOUT='' STDERR='!USAGE!' EXITVAL='1' 07070100000027000081A4000003E8000000640000000161830F5E0000004C000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-005.tstFLAGS='--print NONE --help' STDIN='' STDOUT='' STDERR='!USAGE!' EXITVAL='1' 07070100000028000081A4000003E8000000640000000161830F5E00000042000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-006.tstFLAGS='-d --list' STDIN='' STDOUT='' STDERR='!USAGE!' EXITVAL='1' 07070100000029000081A4000003E8000000640000000161830F5E0000003D000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-007.tstFLAGS='-sdu' STDIN='' STDOUT='' STDERR='!USAGE!' EXITVAL='1' 0707010000002A000081A4000003E8000000640000000161830F5E00000049000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-008.tstFLAGS='--detect foo bar' STDIN='' STDOUT='' STDERR='!USAGE!' EXITVAL='1' 07070100000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000B00000000TRAILER!!!114 blocks