Discussion:
Null character in replacement
Zachary Vance
2014-10-23 08:49:10 UTC
sed -e 's/^$/\x00/' replaces empty lines with a line containing the
null byte.

sed -e "s/^$/\0/" (in bash, an s// expression containing a literal zero
byte) performs no replacements on the stream.

I am not sure if this is simply user error (including error understanding
how bash parses strings before it reaches sed), but I thought I'd report it
in case it was a parsing bug.
Bob Proulx
2014-10-23 18:32:59 UTC
Post by Zachary Vance
sed -e 's/^$/\x00/' replaces empty lines with a line containing the
null byte.
$ echo "" | sed 's/^$/\x00/' | od -tx1 -c
0000000  00  0a
         \0  \n
Post by Zachary Vance
sed -e "s/^$/\0/" (in bash, an s// expression containing a literal zero
byte) performs no replacements on the stream.
It does here.

bash$ echo "" | sed "s/^$/\0/" | od -tx1 -c
0000000  0a
         \n
Post by Zachary Vance
I am not sure if this is simply user error (including error understanding
how bash parses strings before it reaches sed), but I thought I'd report it
in case it was a parsing bug.
The "\0" in bash does not create a literal zero byte.

bash$ echo "s/^$/\0/" | od -tx1 -c
0000000  73  2f  5e  24  2f  5c  30  2f  0a
          s   /   ^   $   /   \   0   /  \n

Additionally, you should escape the "$" inside double quotes for safety.
The unescaped form happens to work here because "$/" is not a valid
expansion, but it isn't safe. Using single quotes to avoid $ expansion
is better, as you know from your other example.
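
To make the quoting point concrete, here is a small sketch that prints the script text sed would receive under each quoting style. The variable x and its value are made up for illustration; they are not from the original report.

```shell
# Hypothetical variable, only here to show what an unescaped "$" can do.
x=OOPS

# Single quotes: the shell passes every character through untouched.
single='s/^$x/\0/'

# Double quotes: "$x" is expanded before sed ever sees the script.
double="s/^$x/\0/"

# Double quotes with the dollar sign escaped: same text as the single-quoted form.
escaped="s/^\$x/\0/"

printf '%s\n' "$single" "$double" "$escaped"
```

Only the middle form changes, which is why the unescaped "$" is the one to avoid. Note that "\0" survives double quotes unchanged, since the shell only treats a backslash specially there before $, `, ", \ and newline.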

Additionally, \0 in the replacement is a back reference to the whole match.

Your substitution works for me, for the definition of "works" that it
does what I expect it to do.

bash$ echo "" | sed "s/^\$/\0/" | od -tx1 -c
0000000  0a
         \n

bash$ echo "" | sed 's/^$/\0/' | od -tx1 -c
0000000  0a
         \n

The ^$ matches the empty line. The \0 back reference stands for the
whole match, which is empty here, and therefore an empty string is
used in the replacement. Perhaps these examples will illustrate back
references well enough.

$ echo abc | sed 's/\(b\)/\1\1\1\1/'
abbbbc

$ echo abc | sed 's/.*\(b\).*/\1\1\1/'
bbb

$ echo abc | sed 's/.*\(b\).*/\0\0\0/'
abcabcabc
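
The last example repeats the whole matched line, which is also what & does in a replacement. A short sketch comparing the two follows; that \0 behaves like & is an assumption based on the GNU sed output shown above, and other sed implementations may treat \0 differently.

```shell
# & is the portable way to refer to the whole match.
with_amp=$(echo abc | sed 's/b/[&]/')

# \0 gives the same result with GNU sed (assumption: other seds may differ).
with_zero=$(echo abc | sed 's/b/[\0]/')

# With an empty match, the whole-match reference is empty too.
empty=$(echo "" | sed 's/^$/[&]/')

printf '%s\n' "$with_amp" "$with_zero" "$empty"
```

The brackets are only there to make an empty reference visible in the output.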

If you want to replace it with a literal backslash zero then you would
need to escape the backslash.

bash$ echo "" | sed 's/^$/\\0/' | od -tx1 -c
0000000  5c  30  0a
          \   0  \n

If that is done inside double quotes then there needs to be one set of
escaping for the shell expansion inside the double quotes and then
another set for sed's interpretation.

bash$ echo "" | sed "s/^\$/\\\\0/" | od -tx1 -c
0000000  5c  30  0a
          \   0  \n
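
The two layers of escaping can be checked by printing the script text before handing it to sed. This is only a sketch; the variable names are chosen for illustration.

```shell
# What sed receives from the single-quoted form: no shell processing at all.
from_single='s/^$/\\0/'

# What sed receives from the double-quoted form: the shell turns \$ into $
# and each \\ into \, leaving the same script text.
from_double="s/^\$/\\\\0/"

printf '%s\n' "$from_single" "$from_double"

# sed then reads \\ as one literal backslash, so an empty line becomes
# a backslash followed by a zero.
echo "" | sed "$from_double"
```

Both printf lines show the same text, so the four backslashes in double quotes are just the shell's encoding of the two that sed needs.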

Hope that helps,
Bob
Andreas Schwab
2014-10-24 13:13:27 UTC
Post by Zachary Vance
sed -e "s/^$/\0/" (in bash, an s// expression containing a literal zero
byte) performs no replacements on the stream.
A command line argument cannot contain a literal zero byte, since
argument strings are zero-terminated. The argument ends at that byte,
and you would get an error about an unterminated command.

$ sed -e $'s/^$/\0/'
sed: -e expression #1, char 5: unterminated `s' command
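
The "char 5" in that message matches the length of the script up to the zero byte. A rough way to see this, assuming bash is installed (it is invoked explicitly because the $'...' quoting above is bash syntax):

```shell
# Ask bash how long the string is after $'...' processing.
# Everything from the zero byte onward is dropped, leaving s/^$/ (5 characters).
len=$(bash <<'EOF'
arg=$'s/^$/\0/'
printf '%s\n' "${#arg}"
EOF
)
echo "$len"

# By contrast, a pipe carries zero bytes without trouble: a, NUL, b is 3 bytes.
nbytes=$(printf 'a\0b' | wc -c | tr -d ' ')
echo "$nbytes"
```

This is why the \x00 escape, which sed itself turns into a zero byte, is the form that works.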

Andreas.
--
Andreas Schwab, ***@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Zachary Vance
2014-10-25 04:02:23 UTC
Andreas: That's extremely helpful to learn. Thanks.