This repository was archived by the owner on Feb 15, 2023. It is now read-only.

Python interface seems to not work. #356

@fake-name

I'm trying to get gumbo to work with Python on Ubuntu 14.04, and not having much luck.

I built and installed gumbo by cloning the master branch:

durr@bigsrv:~/gumbo-parser⟫ ./autogen.sh
+ libtoolize
libtoolize: putting auxiliary files in `.'.
libtoolize: linking file `./ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: linking file `m4/libtool.m4'
libtoolize: linking file `m4/ltoptions.m4'
libtoolize: linking file `m4/ltsugar.m4'
libtoolize: linking file `m4/ltversion.m4'
libtoolize: linking file `m4/lt~obsolete.m4'
+ aclocal -I m4
+ autoconf
+ automake --add-missing
configure.ac:13: installing './compile'
configure.ac:33: installing './config.guess'
configure.ac:33: installing './config.sub'
configure.ac:31: installing './install-sh'
configure.ac:31: installing './missing'
Makefile.am: installing './depcomp'
parallel-tests: installing './test-driver'
durr@bigsrv:~/gumbo-parser⟫ ./configure
checking for g++... g++
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for gcc... gcc
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking for gcc option to accept ISO C99... -std=gnu99
checking how to run the C preprocessor... gcc -std=gnu99 -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking stddef.h usability... yes
checking stddef.h presence... yes
checking for stddef.h... yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking for strings.h... (cached) yes
checking for inline... inline
checking for size_t... yes
checking for main in -lgtest_main... no
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
/bin/bash: /home/durr/missing: No such file or directory
configure: WARNING: 'missing' script is too old or missing
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking for style of include used by make... GNU
checking whether make supports nested variables... yes
checking dependency style of gcc -std=gnu99... gcc3
checking dependency style of g++... gcc3
checking whether make supports nested variables... (cached) yes
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking how to print strings... printf
checking for a sed that does not truncate output... /bin/sed
checking for fgrep... /bin/grep -F
checking for ld used by gcc -std=gnu99... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking how to convert x86_64-unknown-linux-gnu file names to x86_64-unknown-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-unknown-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for ar... ar
checking for archiver @FILE support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc -std=gnu99 object... ok
checking for sysroot... no
checking for mt... mt
checking if mt is a manifest tool... no
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc -std=gnu99 supports -fno-rtti -fno-exceptions... no
checking for gcc -std=gnu99 option to produce PIC... -fPIC -DPIC
checking if gcc -std=gnu99 PIC flag -fPIC -DPIC works... yes
checking if gcc -std=gnu99 static flag -static works... yes
checking if gcc -std=gnu99 supports -c -o file.o... yes
checking if gcc -std=gnu99 supports -c -o file.o... (cached) yes
checking whether the gcc -std=gnu99 linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... yes
checking how to run the C++ preprocessor... g++ -E
checking for ld used by g++... /usr/bin/ld -m elf_x86_64
checking if the linker (/usr/bin/ld -m elf_x86_64) is GNU ld... yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking for g++ option to produce PIC... -fPIC -DPIC
checking if g++ PIC flag -fPIC -DPIC works... yes
checking if g++ static flag -static works... yes
checking if g++ supports -c -o file.o... yes
checking if g++ supports -c -o file.o... (cached) yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking dynamic linker characteristics... (cached) GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating gumbo.pc
config.status: executing depfiles commands
config.status: executing libtool commands
durr@bigsrv:~/gumbo-parser⟫ make
  CC       src/libgumbo_la-attribute.lo
  CC       src/libgumbo_la-char_ref.lo
  CC       src/libgumbo_la-error.lo
  CC       src/libgumbo_la-parser.lo
  CC       src/libgumbo_la-string_buffer.lo
  CC       src/libgumbo_la-string_piece.lo
  CC       src/libgumbo_la-tag.lo
  CC       src/libgumbo_la-tokenizer.lo
  CC       src/libgumbo_la-utf8.lo
  CC       src/libgumbo_la-util.lo
  CC       src/libgumbo_la-vector.lo
  CCLD     libgumbo.la
  CXX      examples/clean_text.o
  CXXLD    clean_text
  CXX      examples/find_links.o
  CXXLD    find_links
  CC       examples/get_title.o
  CCLD     get_title
  CXX      examples/positions_of_class.o
  CXXLD    positions_of_class
  CXX      benchmarks/benchmark.o
  CXXLD    benchmark
  CXX      examples/serialize.o
  CXXLD    serialize
  CXX      examples/prettyprint.o
  CXXLD    prettyprint
durr@bigsrv:~/gumbo-parser⟫ sudo make install
[sudo] password for durr:
make[1]: Entering directory `/home/durr/gumbo-parser'
 /bin/mkdir -p '/usr/local/lib'
 /bin/bash ./libtool   --mode=install /usr/bin/install -c   libgumbo.la '/usr/local/lib'
libtool: install: /usr/bin/install -c .libs/libgumbo.so.1.0.0 /usr/local/lib/libgumbo.so.1.0.0
libtool: install: (cd /usr/local/lib && { ln -s -f libgumbo.so.1.0.0 libgumbo.so.1 || { rm -f libgumbo.so.1 && ln -s libgumbo.so.1.0.0 libgumbo.so.1; }; })
libtool: install: (cd /usr/local/lib && { ln -s -f libgumbo.so.1.0.0 libgumbo.so || { rm -f libgumbo.so && ln -s libgumbo.so.1.0.0 libgumbo.so; }; })
libtool: install: /usr/bin/install -c .libs/libgumbo.lai /usr/local/lib/libgumbo.la
libtool: install: /usr/bin/install -c .libs/libgumbo.a /usr/local/lib/libgumbo.a
libtool: install: chmod 644 /usr/local/lib/libgumbo.a
libtool: install: ranlib /usr/local/lib/libgumbo.a
libtool: finish: PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/sbin" ldconfig -n /usr/local/lib
----------------------------------------------------------------------
Libraries have been installed in:
   /usr/local/lib

If you ever happen to want to link against installed libraries
in a given directory, LIBDIR, you must either use libtool, and
specify the full pathname of the library, or use the `-LLIBDIR'
flag during linking and do at least one of the following:
   - add LIBDIR to the `LD_LIBRARY_PATH' environment variable
     during execution
   - add LIBDIR to the `LD_RUN_PATH' environment variable
     during linking
   - use the `-Wl,-rpath -Wl,LIBDIR' linker flag
   - have your system administrator add LIBDIR to `/etc/ld.so.conf'

See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.
----------------------------------------------------------------------
 /bin/mkdir -p '/usr/local/include'
 /usr/bin/install -c -m 644 src/gumbo.h src/tag_enum.h '/usr/local/include'
 /bin/mkdir -p '/usr/local/lib/pkgconfig'
 /usr/bin/install -c -m 644 gumbo.pc '/usr/local/lib/pkgconfig'
make[1]: Leaving directory `/home/durr/gumbo-parser'

Then I installed the Python extensions:

durr@bigsrv:~/gumbo-parser⟫ sudo python setup.py install
running install
running bdist_egg
running egg_info
writing python/gumbo.egg-info/PKG-INFO
writing top-level names to python/gumbo.egg-info/top_level.txt
writing dependency_links to python/gumbo.egg-info/dependency_links.txt
writing pbr to python/gumbo.egg-info/pbr.json
reading manifest file 'python/gumbo.egg-info/SOURCES.txt'
writing manifest file 'python/gumbo.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build/lib.linux-x86_64-2.7
creating build/lib.linux-x86_64-2.7/gumbo
copying python/gumbo/html5lib_adapter.py -> build/lib.linux-x86_64-2.7/gumbo
copying python/gumbo/gumboc.py -> build/lib.linux-x86_64-2.7/gumbo
copying python/gumbo/soup_adapter.py -> build/lib.linux-x86_64-2.7/gumbo
copying python/gumbo/__init__.py -> build/lib.linux-x86_64-2.7/gumbo
copying python/gumbo/html5lib_adapter_test.py -> build/lib.linux-x86_64-2.7/gumbo
copying python/gumbo/gumboc_tags.py -> build/lib.linux-x86_64-2.7/gumbo
copying python/gumbo/soup_adapter_test.py -> build/lib.linux-x86_64-2.7/gumbo
copying python/gumbo/gumboc_test.py -> build/lib.linux-x86_64-2.7/gumbo
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/gumbo
copying build/lib.linux-x86_64-2.7/gumbo/html5lib_adapter.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib.linux-x86_64-2.7/gumbo/gumboc.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib.linux-x86_64-2.7/gumbo/soup_adapter.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib.linux-x86_64-2.7/gumbo/__init__.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib.linux-x86_64-2.7/gumbo/html5lib_adapter_test.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib.linux-x86_64-2.7/gumbo/gumboc_tags.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib.linux-x86_64-2.7/gumbo/soup_adapter_test.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib.linux-x86_64-2.7/gumbo/gumboc_test.py -> build/bdist.linux-x86_64/egg/gumbo
byte-compiling build/bdist.linux-x86_64/egg/gumbo/html5lib_adapter.py to html5lib_adapter.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/gumboc.py to gumboc.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/soup_adapter.py to soup_adapter.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/html5lib_adapter_test.py to html5lib_adapter_test.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/gumboc_tags.py to gumboc_tags.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/soup_adapter_test.py to soup_adapter_test.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/gumboc_test.py to gumboc_test.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying python/gumbo.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying python/gumbo.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying python/gumbo.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying python/gumbo.egg-info/not-zip-safe -> build/bdist.linux-x86_64/egg/EGG-INFO
copying python/gumbo.egg-info/pbr.json -> build/bdist.linux-x86_64/egg/EGG-INFO
copying python/gumbo.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
creating 'dist/gumbo-0.10.1-py2.7.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing gumbo-0.10.1-py2.7.egg
creating /usr/local/lib/python2.7/dist-packages/gumbo-0.10.1-py2.7.egg
Extracting gumbo-0.10.1-py2.7.egg to /usr/local/lib/python2.7/dist-packages
Adding gumbo 0.10.1 to easy-install.pth file

Installed /usr/local/lib/python2.7/dist-packages/gumbo-0.10.1-py2.7.egg
Processing dependencies for gumbo==0.10.1
Finished processing dependencies for gumbo==0.10.1
durr@bigsrv:~/gumbo-parser⟫ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import gumbo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/gumbo-0.10.1-py2.7.egg/gumbo/__init__.py", line 33, in <module>
    from gumbo.gumboc import *
  File "/usr/local/lib/python2.7/dist-packages/gumbo-0.10.1-py2.7.egg/gumbo/gumboc.py", line 44, in <module>
    os.path.dirname(__file__), _name_of_lib))
  File "/usr/lib/python2.7/ctypes/__init__.py", line 443, in LoadLibrary
    return self._dlltype(name)
  File "/usr/lib/python2.7/ctypes/__init__.py", line 365, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/local/lib/python2.7/dist-packages/gumbo-0.10.1-py2.7.egg/gumbo/libgumbo.so: cannot open shared object file: No such file or directory
>>>

On Python 3:

durr@bigsrv:~/gumbo-parser⟫ sudo python3 setup.py install
running install
Checking .pth file support in /usr/local/lib/python3.4/dist-packages/
/usr/bin/python3 -E -c pass
TEST PASSED: /usr/local/lib/python3.4/dist-packages/ appears to support .pth files
running bdist_egg
running egg_info
writing dependency_links to python/gumbo.egg-info/dependency_links.txt
writing python/gumbo.egg-info/PKG-INFO
writing top-level names to python/gumbo.egg-info/top_level.txt
reading manifest file 'python/gumbo.egg-info/SOURCES.txt'
writing manifest file 'python/gumbo.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/gumbo
copying build/lib/gumbo/html5lib_adapter.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib/gumbo/gumboc.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib/gumbo/soup_adapter.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib/gumbo/__init__.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib/gumbo/html5lib_adapter_test.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib/gumbo/gumboc_tags.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib/gumbo/soup_adapter_test.py -> build/bdist.linux-x86_64/egg/gumbo
copying build/lib/gumbo/gumboc_test.py -> build/bdist.linux-x86_64/egg/gumbo
byte-compiling build/bdist.linux-x86_64/egg/gumbo/html5lib_adapter.py to html5lib_adapter.cpython-34.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/gumboc.py to gumboc.cpython-34.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/soup_adapter.py to soup_adapter.cpython-34.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/__init__.py to __init__.cpython-34.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/html5lib_adapter_test.py to html5lib_adapter_test.cpython-34.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/gumboc_tags.py to gumboc_tags.cpython-34.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/soup_adapter_test.py to soup_adapter_test.cpython-34.pyc
byte-compiling build/bdist.linux-x86_64/egg/gumbo/gumboc_test.py to gumboc_test.cpython-34.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying python/gumbo.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying python/gumbo.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying python/gumbo.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying python/gumbo.egg-info/not-zip-safe -> build/bdist.linux-x86_64/egg/EGG-INFO
copying python/gumbo.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
creating 'dist/gumbo-0.10.1-py3.4.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing gumbo-0.10.1-py3.4.egg
creating /usr/local/lib/python3.4/dist-packages/gumbo-0.10.1-py3.4.egg
Extracting gumbo-0.10.1-py3.4.egg to /usr/local/lib/python3.4/dist-packages
Adding gumbo 0.10.1 to easy-install.pth file

Installed /usr/local/lib/python3.4/dist-packages/gumbo-0.10.1-py3.4.egg
Processing dependencies for gumbo==0.10.1
Finished processing dependencies for gumbo==0.10.1


durr@bigsrv:~⟫ python3
Python 3.4.3 (default, Oct 14 2015, 20:28:29)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import gumbo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/dist-packages/gumbo-0.10.1-py3.4.egg/gumbo/__init__.py", line 33, in <module>
    from gumbo.gumboc import *
  File "/usr/local/lib/python3.4/dist-packages/gumbo-0.10.1-py3.4.egg/gumbo/gumboc.py", line 29, in <module>
    import gumboc_tags
ImportError: No module named 'gumboc_tags'
>>>
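
The ImportError above is consistent with Python 3 no longer supporting implicit relative imports (PEP 328): a bare `import gumboc_tags` inside a package is looked up as a top-level module, not as a sibling of gumboc.py. Below is a minimal sketch of the difference. The throwaway packages `demo_implicit` and `demo_explicit` and the module `demo_tags` are hypothetical names used only for illustration; they are not part of gumbo.

```python
import importlib
import os
import sys
import tempfile


def write_package(root, pkg_name, import_line):
    """Write a throwaway package whose core module imports a sibling module."""
    pkg_dir = os.path.join(root, pkg_name)
    os.makedirs(pkg_dir)
    files = {
        "__init__.py": "",
        "demo_tags.py": "TAG_NAMES = ['html', 'head', 'body']\n",
        "core.py": import_line + "\nFIRST_TAG = demo_tags.TAG_NAMES[0]\n",
    }
    for filename, body in files.items():
        with open(os.path.join(pkg_dir, filename), "w") as handle:
            handle.write(body)


def import_core(pkg_name):
    """Import <pkg_name>.core and return (module, error); one of the two is None."""
    try:
        return importlib.import_module(pkg_name + ".core"), None
    except ImportError as exc:
        return None, exc


root = tempfile.mkdtemp()
sys.path.insert(0, root)

# Same package layout twice; only the import statement in core.py differs.
write_package(root, "demo_implicit", "import demo_tags")
write_package(root, "demo_explicit", "from . import demo_tags")
importlib.invalidate_caches()

implicit_module, implicit_error = import_core("demo_implicit")
explicit_module, explicit_error = import_core("demo_explicit")

print("implicit:", implicit_error)
print("explicit:", explicit_module.FIRST_TAG)
```

If gumboc.py behaves the same way, changing its import to `from . import gumboc_tags` would likely remove the need to move the array, and that form is also accepted by Python 2.7. This is an untested assumption about the gumbo source.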

{{{ moved array to fix that import issue }}}

>>> import gumbo
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/gumbo-0.10.1-py3.4.egg/gumbo/gumboc.py", line 198, in <module>
    os.path.dirname(__file__), '..', '..', '.libs', _name_of_lib))
  File "/usr/lib/python3.4/ctypes/__init__.py", line 429, in LoadLibrary
    return self._dlltype(name)
  File "/usr/lib/python3.4/ctypes/__init__.py", line 351, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/local/lib/python3.4/dist-packages/gumbo-0.10.1-py3.4.egg/gumbo/../../.libs/libgumbo.so: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/dist-packages/gumbo-0.10.1-py3.4.egg/gumbo/__init__.py", line 33, in <module>
    from gumbo.gumboc import *
  File "/usr/local/lib/python3.4/dist-packages/gumbo-0.10.1-py3.4.egg/gumbo/gumboc.py", line 202, in <module>
    os.path.dirname(__file__), _name_of_lib))
  File "/usr/lib/python3.4/ctypes/__init__.py", line 429, in LoadLibrary
    return self._dlltype(name)
  File "/usr/lib/python3.4/ctypes/__init__.py", line 351, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/local/lib/python3.4/dist-packages/gumbo-0.10.1-py3.4.egg/gumbo/libgumbo.so: cannot open shared object file: No such file or directory
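
The two tracebacks above show the loader trying only package-relative paths (first `../../.libs`, then the package directory), neither of which exists after `make install` puts the library in /usr/local/lib. A minimal sketch of a more forgiving lookup, assuming it is acceptable to fall back to the system linker search via `ctypes.util.find_library`. The helper names `candidate_paths` and `load_library` are hypothetical and not part of gumbo.

```python
import ctypes
import ctypes.util
import os


def candidate_paths(package_dir, lib_file="libgumbo.so"):
    """Return the package-relative paths in the order the tracebacks show them tried."""
    return [
        os.path.join(package_dir, "..", "..", ".libs", lib_file),
        os.path.join(package_dir, lib_file),
    ]


def load_library(package_dir, short_name="gumbo", lib_file="libgumbo.so"):
    """Try the package-relative paths first, then ask the system where the library is."""
    for path in candidate_paths(package_dir, lib_file):
        if os.path.exists(path):
            return ctypes.CDLL(path)
    # find_library takes the name without the "lib" prefix or the suffix.
    found = ctypes.util.find_library(short_name)
    if found is None:
        raise OSError("could not locate lib%s" % short_name)
    return ctypes.CDLL(found)


if __name__ == "__main__":
    try:
        lib = load_library(os.path.dirname(os.path.abspath(__file__)))
        print("loaded", lib)
    except OSError as exc:
        print("lookup failed:", exc)
```

On Linux, `find_library` consults the linker cache among other sources, so a library freshly installed under /usr/local/lib may only be found after the cache has been refreshed (for example with `sudo ldconfig`).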


I patched the gumboc_tags import by copying the contents of that file (it's just a single big array) into gumboc.py, then worked around the library search path issue by hardcoding the library path to "/usr/local/lib/libgumbo.so.1.0.0". With both changes the package imports, but gumbo.soup_parse (which is what I want) doesn't seem to be present:

>>> import gumbo
>>> gumbo
<module 'gumbo' from '/usr/local/lib/python3.4/dist-packages/gumbo-0.10.1-py3.4.egg/gumbo/__init__.py'>
>>> gumbo.soup_parse
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'soup_parse'
>>>
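
One possible explanation, which is an assumption because `__init__.py` is not shown here: if the package wraps the adapter import in a `try`/`except ImportError`, any failure inside `gumbo/soup_adapter.py` (for example a missing BeautifulSoup dependency) would silently leave `soup_parse` undefined. Importing the adapter module directly should surface the hidden error. A small diagnostic sketch follows; `probe_import` is a hypothetical helper, while the module name `gumbo.soup_adapter` comes from the file list in the install log.

```python
import importlib
import traceback


def probe_import(module_name):
    """Import a module by name; return None on success or the formatted traceback on failure."""
    try:
        importlib.import_module(module_name)
    except Exception:
        return traceback.format_exc()
    return None


if __name__ == "__main__":
    report = probe_import("gumbo.soup_adapter")
    print(report or "gumbo.soup_adapter imported cleanly")
```

If the printed traceback names a missing module, installing that dependency for the same interpreter would be the next thing to try.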

I also tried the version on PyPI, and it is non-functional after install on Python 3 (my app is Python 3; I tested Python 2 just to be thorough).
