Parsing scripts with arrays #84

@samlikins

Parsing a script that contains an array declaration fails at the opening parenthesis of the array literal (i.e. the ( character).

pip reports the following bashlex installation:

$ pip show bashlex
Name: bashlex
Version: 0.18
Summary: Python parser for bash
Home-page: https://github.com/idank/bashlex.git
Author: Idan Kamara
Author-email: i@idank.me
License: GPLv3+
Location: /home/user/.local/lib/python3.10/site-packages
Requires:
Required-by:

In a Python interactive session with the following setup:

Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import bashlex

Running the bashlex.parse function with the string declare -a CMDS=() produces the following output:

>>> bashlex.parse('declare -a CMDS=()')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/.local/lib/python3.10/site-packages/bashlex/parser.py", line 610, in parse
    parts = [p.parse()]
  File "/home/user/.local/lib/python3.10/site-packages/bashlex/parser.py", line 691, in parse
    tree = theparser.parse(lexer=self.tok, context=self)
  File "/home/user/.local/lib/python3.10/site-packages/bashlex/yacc.py", line 537, in parse
    tok = self.errorfunc(errtoken)
  File "/home/user/.local/lib/python3.10/site-packages/bashlex/parser.py", line 548, in p_error
    raise errors.ParsingError('unexpected token %r' % p.value,
bashlex.errors.ParsingError: unexpected token '(' (position 16)

Without the parentheses, parsing succeeds:

>>> bashlex.parse('declare -a CMDS')
[CommandNode(parts=[WordNode(parts=[] pos=(0, 7) word='declare'), WordNode(parts=[] pos=(8, 10) word='-a'), WordNode(parts=[] pos=(11, 15) word='CMDS')] pos=(0, 15))]

The failure is independent of the declare keyword:

>>> bashlex.parse('CMDS=()')
bashlex.errors.ParsingError: unexpected token '(' (position 5)

The error occurs when appending to the array as well:

>>> bashlex.parse('CMDS+=("init")')
bashlex.errors.ParsingError: unexpected token '(' (position 6)
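
The failing inputs so far all contain the shape NAME=( or NAME+=(. Until the parser handles that form, a caller could flag such lines before passing them to bashlex. The following is only a rough sketch: ARRAY_ASSIGN and has_array_assignment are hypothetical names, not part of bashlex, and the regex is a heuristic that does not understand quoting or comments.

```python
import re

# Heuristic (assumption, not bashlex API): a shell identifier, an optional "+",
# then "=(" marks a compound array assignment such as CMDS=() or CMDS+=("init").
# The lookbehind keeps the match from starting in the middle of a word or after "$".
ARRAY_ASSIGN = re.compile(r"(?<![\w$])[A-Za-z_][A-Za-z0-9_]*\+?=\(")


def has_array_assignment(line):
    """Return True if the line appears to contain NAME=( or NAME+=(."""
    return ARRAY_ASSIGN.search(line) is not None


if __name__ == "__main__":
    for line in ['declare -a CMDS=()', 'CMDS+=("init")', '(env)', 'ARRAY[1]=init']:
        print(repr(line), has_array_assignment(line))
```

Because the check is purely textual, it will also flag the pattern inside quoted strings or comments, so it is better suited to triage than to rewriting scripts.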

Parsing parentheses is not in itself the issue:

>>> bashlex.parse('(env)')
[CompoundNode(list=[ReservedwordNode(pos=(0, 1) word='('), CommandNode(parts=[WordNode(parts=[] pos=(1, 4) word='env')] pos=(1, 4)), ReservedwordNode(pos=(4, 5) word=')')] pos=(0, 5) redirects=[])]

The lexer seems to recognize arrays as WordNodes:

>>> bashlex.parse('ARRAY[1]=init')
[CommandNode(parts=[WordNode(parts=[] pos=(0, 13) word='ARRAY[1]=init')] pos=(0, 13))]
>>> bashlex.parse('echo ${ARRAY[*]}')
[CommandNode(parts=[WordNode(parts=[] pos=(0, 4) word='echo'), WordNode(parts=[ParameterNode(pos=(5, 16) value='ARRAY[*]')] pos=(5, 16) word='${ARRAY[*]}')] pos=(0, 16))]
>>> bashlex.parse('unset ARRAY[1]')
[CommandNode(parts=[WordNode(parts=[] pos=(0, 5) word='unset'), WordNode(parts=[] pos=(6, 14) word='ARRAY[1]')] pos=(0, 14))]

The parser only seems to fail on compound array assignments, i.e. the NAME=(...) and NAME+=(...) forms.
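
To make the cases above easier to re-run against other bashlex versions, they can be collected into one script. This is a minimal sketch: try_parse, summarize and SNIPPETS are names introduced here for illustration, and the parse function is passed in as an argument so that the helper can also be exercised without bashlex installed. Only bashlex.parse, as used in this report, is assumed to exist.

```python
# Inputs taken from this report.
SNIPPETS = [
    'declare -a CMDS=()',
    'declare -a CMDS',
    'CMDS=()',
    'CMDS+=("init")',
    '(env)',
    'ARRAY[1]=init',
    'echo ${ARRAY[*]}',
    'unset ARRAY[1]',
]


def try_parse(parse, source):
    """Return (True, tree) if parse(source) succeeds, else (False, message)."""
    try:
        return True, parse(source)
    except Exception as exc:
        # With bashlex this is bashlex.errors.ParsingError; catching Exception
        # keeps the helper independent of the parser that is passed in.
        return False, str(exc)


def summarize(parse, snippets=SNIPPETS):
    """Map each snippet to True (parsed) or False (raised an error)."""
    return {source: try_parse(parse, source)[0] for source in snippets}


if __name__ == "__main__":
    try:
        import bashlex
    except ImportError:
        print("bashlex is not installed")
    else:
        for source, ok in summarize(bashlex.parse).items():
            print("ok  " if ok else "FAIL", source)
```

With bashlex 0.18 as described in this report, the three snippets containing =( would be expected to show as failures and the rest as successes.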
