Implement shell output format #1645

Merged
merged 6 commits into from May 4, 2023
Changes from 5 commits
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -12,7 +12,7 @@

# Documentation

-The documentation is a bit of a mixed bag (sorry in advanced, I do plan on simplifying it...) - with some parts automatically generated and stiched together and some statically defined.
+The documentation is a bit of a mixed bag (sorry in advance, I do plan on simplifying it...) - with some parts automatically generated and stiched together and some statically defined.

Documentation is written in markdown, and is published in the 'gitbook' branch.

2 changes: 2 additions & 0 deletions cmd/utils.go
@@ -195,6 +195,8 @@ func createEncoder(format yqlib.PrinterOutputFormat) (yqlib.Encoder, error) {
 		return yqlib.NewXMLEncoder(indent, yqlib.ConfiguredXMLPreferences), nil
 	case yqlib.TomlOutputFormat:
 		return yqlib.NewTomlEncoder(), nil
+	case yqlib.ShellVariablesOutputFormat:
+		return yqlib.NewShellVariablesEncoder(), nil
 	}
 	return nil, fmt.Errorf("invalid encoder: %v", format)
 }
86 changes: 86 additions & 0 deletions pkg/yqlib/doc/usage/shellvariables.md
@@ -0,0 +1,86 @@

## Encode shell variables
Note that comments are dropped and values will be enclosed in single quotes as needed.

Given a sample.yml file of:
```yaml
# comment
name: Mike Wazowski
eyes:
  color: turquoise
  number: 1
friends:
  - James P. Sullivan
  - Celia Mae
```
then
```bash
yq -o=shell sample.yml
```
will output
```sh
name='Mike Wazowski'
eyes_color=turquoise
eyes_number=1
friends_0='James P. Sullivan'
friends_1='Celia Mae'
```

## Encode shell variables: illegal variable names as key.
Keys that would be illegal as variable keys are adapted.

Given a sample.yml file of:
```yaml
ascii_=_symbols: replaced with _
"ascii_ _controls": dropped (this example uses \t)
nonascii_א_characters: dropped
effrot_expeñded_tò_preserve_accented_latin_letters: moderate (via unicode NFKD)

```
then
```bash
yq -o=shell sample.yml
```
will output
```sh
ascii___symbols='replaced with _'
ascii__controls='dropped (this example uses \t)'
nonascii__characters=dropped
effrot_expended_to_preserve_accented_latin_letters='moderate (via unicode NFKD)'
```
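
The ASCII part of this key adaptation can be approximated with the standard library alone. The sketch below is an assumption-laden simplification: `sanitizeKey` is a hypothetical name, the predicate already accepts `_`, and the NFKD decomposition step is omitted so that no external module is needed. Without NFKD, an accented letter is dropped entirely instead of being reduced to its base letter.

```go
package main

import (
	"fmt"
	"strings"
)

// isAlphaOrUnderscore reports whether r may start a shell variable name.
func isAlphaOrUnderscore(r rune) bool {
	return r == '_' || ('a' <= r && r <= 'z') || ('A' <= r && r <= 'Z')
}

// sanitizeKey keeps ASCII letters, digits and underscores, drops control and
// non-ASCII characters, and replaces every other printable ASCII character
// with "_". For a root key it also prepends "_" when the result does not
// start with a letter or underscore. NFKD normalization is NOT applied here.
func sanitizeKey(raw string, isRoot bool) string {
	key := strings.Map(func(r rune) rune {
		switch {
		case isAlphaOrUnderscore(r) || ('0' <= r && r <= '9'):
			return r
		case r < 32 || r > 126:
			return -1 // a negative rune tells strings.Map to drop the character
		default:
			return '_'
		}
	}, raw)
	// key is pure ASCII at this point, so indexing the first byte is safe.
	if isRoot && (key == "" || !isAlphaOrUnderscore(rune(key[0]))) {
		return "_" + key
	}
	return key
}

func main() {
	fmt.Println(sanitizeKey("ascii_=_symbols", true))   // ascii___symbols
	fmt.Println(sanitizeKey("ascii_\t_controls", true)) // ascii__controls
	fmt.Println(sanitizeKey("1a", true))                // _1a
}
```

Because several different keys can collapse to the same name, the mapping is not reversible.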

## Encode shell variables: empty values, arrays and maps
Empty values are encoded to empty variables, but empty arrays and maps are skipped.

Given a sample.yml file of:
```yaml
empty:
  value:
  array: []
  map: {}
```
then
```bash
yq -o=shell sample.yml
```
will output
```sh
empty_value=
```

## Encode shell variables: single quotes in values
Single quotes in values are encoded as '"'"' (close single quote, double-quoted single quote, open single quote).

Given a sample.yml file of:
```yaml
name: Miles O'Brien
```
then
```bash
yq -o=shell sample.yml
```
will output
```sh
name='Miles O'"'"'Brien'
```
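
The quoting rule can be tried in isolation with a small standalone program. This is a sketch modelled on the `quoteValue` function in this PR rather than a copy of it; the `isSafe` helper is a hypothetical name, and it treats `_` as safe.

```go
package main

import (
	"fmt"
	"strings"
)

// isSafe reports whether r can appear in an unquoted shell word here:
// ASCII letters, digits and underscore only.
func isSafe(r rune) bool {
	return r == '_' ||
		('a' <= r && r <= 'z') ||
		('A' <= r && r <= 'Z') ||
		('0' <= r && r <= '9')
}

// quoteValue returns value unchanged when every character is safe.
// Otherwise it wraps the value in single quotes and rewrites each embedded
// single quote as '"'"' (close quote, double-quoted quote, reopen quote).
func quoteValue(value string) string {
	if strings.IndexFunc(value, func(r rune) bool { return !isSafe(r) }) < 0 {
		return value
	}
	return "'" + strings.ReplaceAll(value, "'", `'"'"'`) + "'"
}

func main() {
	fmt.Println("name=" + quoteValue("Miles O'Brien")) // name='Miles O'"'"'Brien'
	fmt.Println("color=" + quoteValue("turquoise"))    // color=turquoise
	fmt.Println("empty=" + quoteValue(""))             // empty=
}
```

Single quotes are a convenient choice because the shell performs no expansion inside them, so the single quote itself is the only character that needs special handling.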

153 changes: 153 additions & 0 deletions pkg/yqlib/encoder_shellvariables.go
@@ -0,0 +1,153 @@
package yqlib

import (
	"fmt"
	"io"
	"strings"
	"unicode/utf8"

	"golang.org/x/text/unicode/norm"
	yaml "gopkg.in/yaml.v3"
)

type shellVariablesEncoder struct {
}

func NewShellVariablesEncoder() Encoder {
	return &shellVariablesEncoder{}
}

func (pe *shellVariablesEncoder) CanHandleAliases() bool {
	return false
}

func (pe *shellVariablesEncoder) PrintDocumentSeparator(_ io.Writer) error {
	return nil
}

func (pe *shellVariablesEncoder) PrintLeadingContent(_ io.Writer, _ string) error {
	return nil
}

func (pe *shellVariablesEncoder) Encode(writer io.Writer, node *yaml.Node) error {

	mapKeysToStrings(node)
	err := pe.doEncode(&writer, node, "", nil)
	if err != nil {
		return err
	}

	return err
}

func (pe *shellVariablesEncoder) doEncode(w *io.Writer, node *yaml.Node, path string, _ *yaml.Node) error {
Owner

Not sure why you have the last argument here if it's just ignored?

Contributor Author

It's a vestigial remnant of the properties encoder that I initially copy-pasted - good catch!

Contributor Author

Fixed with 8902f98


	// Note this drops all comments.

	switch node.Kind {
	case yaml.ScalarNode:
		nonemptyPath := path
		if path == "" {
			// We can't assign an empty variable "=somevalue" because that would error out if sourced in a shell,
			// nor can we use "_" as a variable name ($_ is a special shell variable that can't be assigned)...
			// let's just pick a fallback key to use if we are encoding a single scalar
			nonemptyPath = "value"
		}
		_, err := io.WriteString(*w, nonemptyPath+"="+quoteValue(node.Value)+"\n")
		return err
	case yaml.DocumentNode:
		return pe.doEncode(w, node.Content[0], path, node)
	case yaml.SequenceNode:
		for index, child := range node.Content {
			err := pe.doEncode(w, child, appendPath(path, index), nil)
			if err != nil {
				return err
			}
		}
		return nil
	case yaml.MappingNode:
		for index := 0; index < len(node.Content); index = index + 2 {
			key := node.Content[index]
			value := node.Content[index+1]
			err := pe.doEncode(w, value, appendPath(path, key.Value), key)
			if err != nil {
				return err
			}
		}
		return nil
	case yaml.AliasNode:
		return pe.doEncode(w, node.Alias, path, nil)
	default:
		return fmt.Errorf("Unsupported node %v", node.Tag)
	}
}

func appendPath(cookedPath string, rawKey interface{}) string {

	// Shell variable names must match
	//    [a-zA-Z_]+[a-zA-Z0-9_]*
	//
	// While this is not mandated by POSIX, which is quite lenient, it is
	// what shells (for example busybox ash *) allow in practice.
	//
	// Since yaml names can contain basically any character, we will process them according to these steps:
	//
	// 1. apply unicode compatibility decomposition NFKD (this will convert accented
	//    letters to letters followed by accents, split ligatures, replace exponents
	//    with the corresponding digit, etc.
	//
	// 2. discard non-ASCII characters as well as ASCII control characters (ie. anything
	//    with code point < 32 or > 126), this will eg. discard accents but keep the base
	//    unaccented letter because of NFKD above
	//
	// 3. replace all non-alphanumeric characters with _
	//
	// Moreover, for the root key only, we will prepend an underscore if what results from the steps above
	// does not start with [a-zA-Z_] (ie. if the root key starts with a digit).
	//
	// Note this is NOT a 1:1 mapping.
	//
	// (*) see endofname.c from https://git.busybox.net/busybox/tag/?h=1_36_0

	// XXX empty strings
giorgiga marked this conversation as resolved.

	key := strings.Map(func(r rune) rune {
		if isAlphaNumericOrUnderscore(r) {
			return r
		} else if r < 32 || 126 < r {
			return -1
		}
		return '_'
	}, norm.NFKD.String(fmt.Sprintf("%v", rawKey)))

	if cookedPath == "" {
		firstRune, _ := utf8.DecodeRuneInString(key)
		if !isAlphaOrUnderscore(firstRune) {
			return "_" + key
		}
		return key
	}
	return cookedPath + "_" + key
}

func quoteValue(value string) string {
	needsQuoting := false
	for _, r := range value {
		if !isAlphaNumericOrUnderscore(r) {
			needsQuoting = true
			break
		}
	}
	if needsQuoting {
		return "'" + strings.ReplaceAll(value, "'", "'\"'\"'") + "'"
	}
	return value
}

func isAlphaOrUnderscore(r rune) bool {
	return ('a' <= r && r <= 'z') || ('A' <= r && r <= 'Z')
Owner

Doesn't this need `|| r == '_'`?

Contributor Author

The function was initially just isAlpha... I must have thought renaming was all it needed to change the implementation!

I'm adding a test case for this... well, for when the root key starts with an underscore, which is why this function should allow _.

Contributor Author

Fixed with 8902f98

}

func isAlphaNumericOrUnderscore(r rune) bool {
	return isAlphaOrUnderscore(r) || ('0' <= r && r <= '9')
}
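
The effect of the missing underscore check raised in review can be reproduced with a short standalone program. This sketch assumes the follow-up fix simply adds `r == '_'` as the reviewer suggested; the contents of commit 8902f98 are not shown on this page. `isAlphaBuggy` and `rootName` are hypothetical names used only for the comparison.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isAlphaBuggy mirrors the predicate as it appears in this revision of the
// diff: despite the original function's name, it rejects '_'.
func isAlphaBuggy(r rune) bool {
	return ('a' <= r && r <= 'z') || ('A' <= r && r <= 'Z')
}

// isAlphaOrUnderscore adds the underscore check suggested in review
// (assumed shape of the fix).
func isAlphaOrUnderscore(r rune) bool {
	return isAlphaBuggy(r) || r == '_'
}

// rootName prepends "_" when the first rune of key is rejected by ok,
// which is what appendPath does for root keys.
func rootName(key string, ok func(rune) bool) string {
	first, _ := utf8.DecodeRuneInString(key)
	if !ok(first) {
		return "_" + key
	}
	return key
}

func main() {
	fmt.Println(rootName("_private", isAlphaBuggy))        // __private
	fmt.Println(rootName("_private", isAlphaOrUnderscore)) // _private
	fmt.Println(rootName("1a", isAlphaOrUnderscore))       // _1a
}
```

With the unfixed predicate, a root key that already starts with an underscore gets a second one, and any value containing `_` would be quoted even though it does not need to be.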
85 changes: 85 additions & 0 deletions pkg/yqlib/encoder_shellvariables_test.go
@@ -0,0 +1,85 @@
package yqlib

import (
	"bufio"
	"bytes"
	"strings"
	"testing"

	"github.com/mikefarah/yq/v4/test"
)

func assertEncodesTo(t *testing.T, yaml string, shellvars string) {
	var output bytes.Buffer
	writer := bufio.NewWriter(&output)

	var encoder = NewShellVariablesEncoder()
	inputs, err := readDocuments(strings.NewReader(yaml), "test.yml", 0, NewYamlDecoder(ConfiguredYamlPreferences))
	if err != nil {
		panic(err)
	}
	node := inputs.Front().Value.(*CandidateNode).Node
	err = encoder.Encode(writer, node)
	if err != nil {
		panic(err)
	}
	writer.Flush()

	test.AssertResult(t, shellvars, strings.TrimSuffix(output.String(), "\n"))
}

func TestShellVariablesEncoderNonquoting(t *testing.T) {
	assertEncodesTo(t, "a: alice", "a=alice")
}

func TestShellVariablesEncoderQuoting(t *testing.T) {
	assertEncodesTo(t, "a: Lewis Carroll", "a='Lewis Carroll'")
}

func TestShellVariablesEncoderQuotesQuoting(t *testing.T) {
	assertEncodesTo(t, "a: Lewis Carroll's Alice", "a='Lewis Carroll'\"'\"'s Alice'")
}

func TestShellVariablesEncoderStripComments(t *testing.T) {
	assertEncodesTo(t, "a: Alice # comment", "a=Alice")
}

func TestShellVariablesEncoderMap(t *testing.T) {
	assertEncodesTo(t, "a:\n b: Lewis\n c: Carroll", "a_b=Lewis\na_c=Carroll")
}

func TestShellVariablesEncoderArray_Unwrapped(t *testing.T) {
	assertEncodesTo(t, "a: [{n: Alice}, {n: Bob}]", "a_0_n=Alice\na_1_n=Bob")
}

func TestShellVariablesEncoderKeyNonPrintable(t *testing.T) {
	assertEncodesTo(t, `"be\all": ring!`, "bell='ring!'")
}

func TestShellVariablesEncoderKeyPrintableNonAlphaNumeric(t *testing.T) {
	assertEncodesTo(t, `"b-e l=l": ring!`, "b_e_l_l='ring!'")
}

func TestShellVariablesEncoderKeyPrintableNonAscii(t *testing.T) {
	assertEncodesTo(t, `"b\u00e9ll": ring!`, "bell='ring!'")
}

func TestShellVariablesEncoderRootKeyStartingWithDigit(t *testing.T) {
	assertEncodesTo(t, "1a: onea", "_1a=onea")
}

func TestShellVariablesEncoderEmptyValue(t *testing.T) {
	assertEncodesTo(t, "empty:", "empty=")
}

func TestShellVariablesEncoderEmptyArray(t *testing.T) {
	assertEncodesTo(t, "empty: []", "")
}

func TestShellVariablesEncoderEmptyMap(t *testing.T) {
	assertEncodesTo(t, "empty: {}", "")
}

func TestShellVariablesEncoderScalarNode(t *testing.T) {
	assertEncodesTo(t, "some string", "value='some string'")
}
}
5 changes: 4 additions & 1 deletion pkg/yqlib/printer.go
@@ -32,6 +32,7 @@ const (
 	UriOutputFormat
 	ShOutputFormat
 	TomlOutputFormat
+	ShellVariablesOutputFormat
 )

 func OutputFormatFromString(format string) (PrinterOutputFormat, error) {
@@ -50,8 +51,10 @@ func OutputFormatFromString(format string) (PrinterOutputFormat, error) {
 		return XMLOutputFormat, nil
 	case "toml":
 		return TomlOutputFormat, nil
+	case "shell", "s", "sh":
+		return ShellVariablesOutputFormat, nil
 	default:
-		return 0, fmt.Errorf("unknown format '%v' please use [yaml|json|props|csv|tsv|xml]", format)
+		return 0, fmt.Errorf("unknown format '%v' please use [yaml|json|props|csv|tsv|xml|toml|shell]", format)
 	}
 }
