Merged
21 commits
207c5f0
Fix bug with FakeFileSystem
seddonym Dec 2, 2025
38d23e9
Correct wrong type being passed in test
seddonym Dec 3, 2025
4362eed
Switch module_files to use Set
seddonym Dec 4, 2025
c09464f
Prefactor test_walk
seddonym Dec 4, 2025
5a76f4e
Parse contents differently in FakeFileSystem
seddonym Dec 4, 2025
dd805dc
Add test cases to show better file system walking
seddonym Dec 4, 2025
11c2943
Prefactor test for adding more namespace tests
seddonym Dec 4, 2025
b851f83
Add deterministic ordering to dataclasses
seddonym Dec 4, 2025
cff77f7
Allow passing namespace packages to ModuleFinder
seddonym Dec 4, 2025
ba6e828
Include namespace_packages in FoundPackage
seddonym Dec 4, 2025
cc94114
Make package finder return multiple directories
seddonym Dec 2, 2025
143a531
Return all directories from namespace packages
seddonym Dec 3, 2025
b17128c
Adjust BaseFakePackageFinder to support multiple directories
seddonym Dec 3, 2025
53bc1c9
Allow passing in namespace packages to build_graph
seddonym Dec 4, 2025
3f11a66
Don't include directories that have no Python files within them
seddonym Dec 5, 2025
846b796
Include namespaces in graph
seddonym Dec 5, 2025
60fc3a2
Update changelog
seddonym Dec 8, 2025
3a11988
Expand test to include building graph from root namespace
seddonym Dec 9, 2025
a725e19
Don't drill down into invalid identifier directories
seddonym Dec 9, 2025
685f911
Include imports of namespace packages
seddonym Dec 10, 2025
c76a1d4
Add docs for better namespace support
seddonym Dec 10, 2025
1 change: 1 addition & 0 deletions CHANGELOG.rst
@@ -7,6 +7,7 @@ latest

* Drop support for Python 3.9.
* Bugfix: don't treat t-strings as syntax errors. https://github.com/python-grimp/grimp/issues/268
* Support building graph from namespace packages, not just their portions.

3.13 (2025-10-29)
-----------------
31 changes: 24 additions & 7 deletions README.rst
@@ -114,20 +114,37 @@ You may analyse multiple root packages. To do this, pass each package name as a
Namespace packages
------------------

Graphs can also be built from `portions`_ of `namespace packages`_. To do this, provide the portion name, rather than the namespace name::

>>> graph = grimp.build_graph('somenamespace.foo')
Graphs can be built either from `namespace packages`_ or from their `portions`_.

What's a namespace package?
###########################

Namespace packages are a Python feature that allows subpackages to be distributed independently, while still being importable under a shared namespace. This is, for example, used by `the Python client for Google's Cloud Logging API`_. When installed, it is importable in Python as ``google.cloud.logging``. The parent packages ``google`` and ``google.cloud`` are both namespace packages, while ``google.cloud.logging`` is known as the 'portion'. Other portions in the same namespace can be installed separately, for example ``google.cloud.secretmanager``.
Namespace packages are a Python feature that allows subpackages to be distributed independently, while
still being importable under a shared namespace.

This is used by
`the Python client for Google's Cloud Logging API`_, for example. When installed, it is importable
in Python as ``google.cloud.logging``. The parent packages ``google`` and ``google.cloud`` are both namespace
packages, while ``google.cloud.logging`` is known as the 'portion'. Other portions in the same
namespace can be installed separately, for example ``google.cloud.secretmanager``.
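To make the terminology concrete, here is a minimal standard-library sketch (it is not part of Grimp). The names ``demo_ns``, ``foo`` and ``bar`` are made up for illustration: two portions are created in separate directories, standing in for separately installed distributions, and both become importable under the same namespace.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_portion(root: Path, namespace: str, portion: str) -> None:
    """Create <root>/<namespace>/<portion>/__init__.py, leaving <namespace> without one."""
    portion_dir = root / namespace / portion
    portion_dir.mkdir(parents=True)
    (portion_dir / "__init__.py").write_text(f"NAME = {portion!r}\n")


def import_portions() -> tuple[str, str, int]:
    # Two separate "install locations", each holding one portion of the same namespace.
    with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
        make_portion(Path(first), "demo_ns", "foo")
        make_portion(Path(second), "demo_ns", "bar")
        sys.path[:0] = [first, second]
        importlib.invalidate_caches()
        try:
            foo = importlib.import_module("demo_ns.foo")
            bar = importlib.import_module("demo_ns.bar")
            namespace = sys.modules["demo_ns"]
            # The namespace's __path__ spans both directories.
            return foo.NAME, bar.NAME, len(list(namespace.__path__))
        finally:
            # Undo the changes to interpreter state made for the demonstration.
            sys.path.remove(first)
            sys.path.remove(second)
            for name in ("demo_ns.foo", "demo_ns.bar", "demo_ns"):
                sys.modules.pop(name, None)


if __name__ == "__main__":
    print(import_portions())
```

Neither directory contains ``demo_ns/__init__.py``; that absence is what makes ``demo_ns`` a namespace package rather than a regular one.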

Examples::

# In this one, the portion is supplied. Neither "google" nor "google.cloud"
# will appear in the graph.
>>> graph = grimp.build_graph("google.cloud.logging")

# In this one, a namespace is supplied.
# Both "google" and "google.cloud" will appear in the graph,
# as will other installed packages under the "google" namespace such
# as "google.auth".
>>> graph = grimp.build_graph("google")

Grimp expects the package name passed to ``build_graph`` to be a portion, rather than a namespace package. So in the case of the example above, the graph should be built like so::
# This one supplies a subnamespace of "google" - it will include
# "google.cloud.logging" and "google.cloud.secretmanager" but not "google.auth".
>>> graph = grimp.build_graph("google.cloud")

>>> graph = grimp.build_graph('google.cloud.logging')

If, instead, a namespace package is passed (e.g. ``grimp.build_graph('google.cloud')``), Grimp will raise ``NamespacePackageEncountered``.

.. _portions: https://docs.python.org/3/glossary.html#term-portion
.. _namespace packages: https://docs.python.org/3/glossary.html#term-namespace-package
9 changes: 5 additions & 4 deletions docs/usage.rst
@@ -57,19 +57,20 @@ Building the graph
Build and return an ImportGraph for the supplied package or packages.

:param str package_name: The name of an importable package, for example ``'mypackage'``. For regular packages, this
must be the top level package (i.e. one with no dots in its name). However, in the special case of
`namespace packages`_, the name of the *portion* should be supplied, for example ``'mynamespace.foo'``.
must be the top level package (i.e. one with no dots in its name). In the special case of
`namespace packages`_, the name of the *portion* may be supplied instead, for example ``'mynamespace.foo'``.
If the portion is supplied, its ancestor packages will not be included in the graph.
:param tuple[str, ...] additional_package_names: Tuple of any additional package names. These can be
supplied as positional arguments, as in the example above.
:param bool, optional include_external_packages: Whether to include external packages in the import graph. If this is ``True``,
any other top level packages (including packages in the standard library) that are imported by this package will
be included in the graph as squashed modules (see `Terminology`_ above).

The behaviour is more complex if one of the internal packages is a `namespace portion`_.
The behaviour is more complex if one of the specified packages is a `namespace portion`_.
In this case, the squashed module will have the shallowest name that doesn't clash with any internal modules.
For example, in a graph with internal packages ``namespace.foo`` and ``namespace.bar.one.green``,
``namespace.bar.one.orange.alpha`` would be added to the graph as ``namespace.bar.one.orange``. However, in a graph
with only ``namespace.foo`` as an internal package, the same external module would be added as
with only ``namespace.foo`` passed, the same external module would be added as
``namespace.bar``.
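The "shallowest name that doesn't clash" rule can be illustrated with a short sketch. This is not Grimp's implementation; ``squashed_name`` is a hypothetical helper, and it assumes the module passed in really is external to every specified package.

```python
def squashed_name(external_module: str, internal_packages: set[str]) -> str:
    """
    Return the shallowest prefix of external_module that is neither equal to,
    nor an ancestor of, any internal package.
    """
    components = external_module.split(".")
    for depth in range(1, len(components) + 1):
        candidate = ".".join(components[:depth])
        clashes = any(
            internal == candidate or internal.startswith(candidate + ".")
            for internal in internal_packages
        )
        if not clashes:
            return candidate
    # Every prefix is an ancestor of an internal package, so no distinct name remains.
    raise ValueError(f"{external_module!r} clashes with the internal packages at every depth")


if __name__ == "__main__":
    print(squashed_name("namespace.bar.one.orange.alpha", {"namespace.foo", "namespace.bar.one.green"}))
    print(squashed_name("namespace.bar.one.orange.alpha", {"namespace.foo"}))
```

With both packages specified, ``namespace``, ``namespace.bar`` and ``namespace.bar.one`` are all ancestors of an internal package, so the first usable name is ``namespace.bar.one.orange``. With only ``namespace.foo`` specified, ``namespace.bar`` is already free of clashes.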

*Note: external packages are only analysed as modules that are imported; any imports they make themselves will
10 changes: 4 additions & 6 deletions rust/src/filesystem.rs
@@ -323,7 +323,7 @@ fn parse_indented_file_system_string(file_system_string: &str) -> HashMap<String
let mut file_paths_map: HashMap<String, String> = HashMap::new();
let mut path_stack: Vec<String> = Vec::new(); // Stores current directory path components
let mut first_line = true; // Flag to handle the very first path component

let mut first_line_indent: usize = 0;
// Normalize newlines and split into lines
let buffer = file_system_string.replace("\r\n", "\n");
let lines: Vec<&str> = buffer.split('\n').collect();
@@ -334,27 +334,25 @@ fn parse_indented_file_system_string(file_system_string: &str) -> HashMap<String
if line.is_empty() {
continue; // Skip empty lines
}

let current_indent = line.chars().take_while(|&c| c.is_whitespace()).count();
let current_indent =
line.chars().take_while(|&c| c.is_whitespace()).count() - first_line_indent;
let trimmed_line = line.trim_start();

// Assuming 4 spaces per indentation level for calculating depth
// Adjust this if your indentation standard is different (e.g., 2 spaces, tabs)
let current_depth = current_indent / 4;

if first_line {
// The first non-empty line sets the base path.
// It might be absolute (/a/b/) or relative (a/b/).
let root_component = trimmed_line.trim_end_matches('/').to_string();
path_stack.push(root_component);
first_line = false;
first_line_indent = current_indent;
} else {
// Adjust the path_stack based on indentation level
// Pop elements from the stack until we reach the correct parent directory depth
while path_stack.len() > current_depth {
path_stack.pop();
}

// If the current line is a file, append it to the path for inserting into map,
// then pop it off so that subsequent siblings are correctly handled.
// If it's a directory, append it and it stays on the stack for its children.
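For readers less familiar with the Rust side, the effect of subtracting ``first_line_indent`` can be sketched in Python. This is a simplified illustration, not a port of the function: ``parse_indented_tree`` is a hypothetical name, it returns a list of paths rather than a map of contents, and it assumes that directory lines end with ``/`` (the hunk above does not show how files are told apart from directories).

```python
def parse_indented_tree(text: str, indent_width: int = 4) -> list[str]:
    """
    Turn an indented listing into full file paths.

    Assumes directories end with "/" and each level is indented by indent_width spaces.
    """
    file_paths: list[str] = []
    path_stack: list[str] = []  # Current directory path components.
    base_indent: int | None = None

    for line in text.replace("\r\n", "\n").split("\n"):
        if not line.strip():
            continue  # Skip empty lines.
        raw_indent = len(line) - len(line.lstrip())
        name = line.strip()
        if base_indent is None:
            # The first non-empty line sets both the root path and the baseline indent.
            base_indent = raw_indent
            path_stack.append(name.rstrip("/"))
            continue
        # Depth is measured relative to the first line, so the whole listing may be indented.
        depth = (raw_indent - base_indent) // indent_width
        # Pop back up to the parent directory for this depth.
        del path_stack[depth:]
        if name.endswith("/"):
            path_stack.append(name.rstrip("/"))
        else:
            file_paths.append("/".join(path_stack + [name]))
    return file_paths
```

Because depth is relative to the first line, a listing embedded in an indented triple-quoted string parses the same as one starting at column zero.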
3 changes: 3 additions & 0 deletions rust/src/import_scanning.rs
@@ -58,6 +58,9 @@ fn get_modules_from_found_packages(found_packages: &HashSet<FoundPackage>) -> Ha
for module_file in &package.module_files {
modules.insert(module_file.module.clone());
}
for namespace_module in &package.namespace_packages {
modules.insert(namespace_module.clone());
}
}
modules
}
21 changes: 19 additions & 2 deletions rust/src/module_finding.rs
@@ -27,6 +27,7 @@ pub struct FoundPackage {
pub directory: String,
// BTreeSet rather than HashSet is necessary to make FoundPackage hashable.
pub module_files: BTreeSet<ModuleFile>,
pub namespace_packages: BTreeSet<Module>,
}

/// Implements conversion from a Python 'FoundPackage' object to the Rust 'FoundPackage' struct.
@@ -41,20 +42,36 @@ impl<'py> FromPyObject<'py> for FoundPackage {
// Access the 'module_files' attribute.
let module_files_py = ob.getattr("module_files")?;
// Downcast the PyAny object to a PyFrozenSet, as Python 'FrozenSet' maps to 'PyFrozenSet'.
let py_frozen_set = module_files_py.downcast::<PyFrozenSet>()?;
let module_files_frozenset = module_files_py.downcast::<PyFrozenSet>()?;

let mut module_files = BTreeSet::new();
// Iterate over the Python frozenset.
for py_module_file_any in py_frozen_set.iter() {
for py_module_file_any in module_files_frozenset.iter() {
// Extract each element (PyAny) into a Rust 'ModuleFile'.
let module_file: ModuleFile = py_module_file_any.extract()?;
module_files.insert(module_file);
}

// Access the 'namespace_packages' attribute.
let namespace_packages_py = ob.getattr("namespace_packages")?;
// Downcast the PyAny object to a PyFrozenSet, as Python 'FrozenSet' maps to 'PyFrozenSet'.
let namespace_packages_frozenset = namespace_packages_py.downcast::<PyFrozenSet>()?;

let mut namespace_packages = BTreeSet::new();
// Iterate over the Python frozenset.
for py_namespace_any in namespace_packages_frozenset.iter() {
// Extract each element (PyAny) into a Rust 'String'.
let namespace_package: String = py_namespace_any.extract()?;
namespace_packages.insert(Module {
name: namespace_package,
});
}

Ok(FoundPackage {
name,
directory,
module_files,
namespace_packages,
})
}
}
97 changes: 80 additions & 17 deletions src/grimp/adaptors/modulefinder.py
@@ -1,5 +1,5 @@
import logging
from collections.abc import Iterable
from collections.abc import Iterable, Set

from grimp.application.ports import modulefinder
from grimp.application.ports.filesystem import AbstractFileSystem
@@ -16,7 +16,10 @@ def find_package(

module_files: list[modulefinder.ModuleFile] = []

for module_filename in self._get_python_files_inside_package(package_directory):
python_files, namespace_dirs = self._get_python_files_and_namespace_dirs_inside_package(
package_directory
)
for module_filename in python_files:
module_name = self._module_name_from_filename(
package_name, module_filename, package_directory
)
@@ -25,39 +28,82 @@ def find_package(
modulefinder.ModuleFile(module=Module(module_name), mtime=module_mtime)
)

namespace_packages = frozenset(
{
self._namespace_from_dir(package_name, namespace_dir, package_directory)
for namespace_dir in namespace_dirs
}
)

return modulefinder.FoundPackage(
name=package_name,
directory=package_directory,
module_files=frozenset(module_files),
namespace_packages=namespace_packages,
)

def _get_python_files_inside_package(self, directory: str) -> Iterable[str]:
def _get_python_files_and_namespace_dirs_inside_package(
self, directory: str
) -> tuple[Iterable[str], Set[str]]:
"""
Get a list of Python files within the supplied package directory.
Return:
Generator of Python file names.
Search the supplied package directory for Python files and namespaces.

Return tuple consisting of:
1. Iterable of Python file names.
2. Set of namespace directories encountered.
"""
python_files: list[str] = []
candidate_namespace_dirs: list[str] = []
portion_dirs: set[str] = set()

for dirpath, dirs, files in self.file_system.walk(directory):
# Don't include directories that aren't Python packages,
# nor their subdirectories.
if "__init__.py" not in files:
for d in list(dirs):
dirs.remove(d)
continue

# Don't include hidden directories.
if self._is_in_portion(dirpath, portion_dirs):
# Are we somewhere inside a non-namespace package?
if "__init__.py" not in files:
# Don't drill down further in this directory.
# (This means we won't include 'orphans' - Python packages deeply nested
# in a package that has already included __init__.py files.)
for d in list(dirs):
dirs.remove(d)
continue
elif "__init__.py" in files:
# This directory is a portion (i.e. it has a top-level __init__.py).
portion_dirs.add(dirpath)
else:
# We don't yet know whether this is a namespace dir. It'll only be one if we find
# a Python file somewhere within it.
candidate_namespace_dirs.append(dirpath)

# Don't include directories that aren't valid identifiers.
dirs_to_remove = [d for d in dirs if self._should_ignore_dir(d)]
for d in dirs_to_remove:
dirs.remove(d)

for filename in files:
if self._is_python_file(filename, dirpath):
yield self.file_system.join(dirpath, filename)
python_files.append(self.file_system.join(dirpath, filename))

namespace_dirs = self._determine_namespace_dirs(candidate_namespace_dirs, python_files)
return python_files, namespace_dirs

def _is_in_portion(self, directory: str, portions: Set[str]) -> bool:
return any(directory.startswith(portion + self.file_system.sep) for portion in portions)

def _should_ignore_dir(self, directory: str) -> bool:
# TODO: make this configurable.
# Skip adding directories that are hidden.
return directory.startswith(".")
return not directory.isidentifier()

def _determine_namespace_dirs(
self, candidates: Iterable[str], python_files: Iterable[str]
) -> set[str]:
namespace_dirs: set[str] = set()
for candidate in candidates:
candidate_with_trailing_sep = candidate + self.file_system.sep
for python_file in python_files:
if python_file.startswith(candidate_with_trailing_sep):
namespace_dirs.add(candidate)
break
return namespace_dirs
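The classification logic above can be tried out in isolation with a rough sketch. It is not the code under review: it uses ``os.walk`` in place of Grimp's file system abstraction, treats any ``*.py`` file as a Python file, and the names ``classify_directories`` and ``demo`` are hypothetical.

```python
import os
import tempfile


def classify_directories(root: str) -> tuple[set[str], set[str]]:
    """Return (portion_dirs, namespace_dirs) found at or below root."""
    portions: set[str] = set()
    candidates: list[str] = []
    python_files: list[str] = []

    for dirpath, dirs, files in os.walk(root):
        if any(dirpath.startswith(portion + os.sep) for portion in portions):
            if "__init__.py" not in files:
                dirs[:] = []  # Not a package, so don't drill down any further.
                continue
        elif "__init__.py" in files:
            portions.add(dirpath)  # Topmost directory with an __init__.py.
        else:
            candidates.append(dirpath)  # Only a namespace if it turns out to hold Python files.
        # Prune directories whose names are not valid identifiers.
        dirs[:] = [d for d in dirs if d.isidentifier()]
        python_files.extend(os.path.join(dirpath, f) for f in files if f.endswith(".py"))

    namespaces = {
        candidate
        for candidate in candidates
        if any(f.startswith(candidate + os.sep) for f in python_files)
    }
    return portions, namespaces


def demo() -> tuple[list[str], list[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        for package in ("ns/foo", "ns/foo/sub", "ns/deeper/bar"):
            package_dir = os.path.join(tmp, *package.split("/"))
            os.makedirs(package_dir)
            open(os.path.join(package_dir, "__init__.py"), "w").close()
        os.makedirs(os.path.join(tmp, "ns", "empty"))  # Holds no Python files.
        os.makedirs(os.path.join(tmp, "ns", "not-valid"))  # Not a valid identifier.
        open(os.path.join(tmp, "ns", "not-valid", "x.py"), "w").close()

        portions, namespaces = classify_directories(os.path.join(tmp, "ns"))

        def dotted(paths: set[str]) -> list[str]:
            return sorted(os.path.relpath(p, tmp).replace(os.sep, ".") for p in paths)

        return dotted(portions), dotted(namespaces)


if __name__ == "__main__":
    print(demo())
```

In the demo tree, ``ns.empty`` is dropped because nothing importable lives beneath it, and ``ns/not-valid`` is never visited, which matches the intent of the "no Python files" and "invalid identifier" commits.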

def _is_python_file(self, filename: str, dirpath: str) -> bool:
"""
@@ -107,3 +153,20 @@ def _module_name_from_filename(
if components[-1] == "__init__":
components.pop()
return ".".join(components)

def _namespace_from_dir(
self, package_name: str, namespace_dir: str, package_directory: str
) -> str:
"""
Args:
package_name (string) - the importable name of the top level package. Could
be namespaced.
namespace_dir (string) - the full name of the namespace directory.
package_directory (string) - the full path of the top level Python package directory.
Returns:
Absolute module name for importing (string).
"""
parent_of_package_directory = package_directory[: -len(package_name)]
directory_relative_to_parent = namespace_dir[len(parent_of_package_directory) :]
components = directory_relative_to_parent.split(self.file_system.sep)
return ".".join(components)
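A worked example of the slicing may help when reviewing this helper. The sketch below copies the arithmetic into a free function (``namespace_from_dir``, with a hypothetical ``sep`` parameter standing in for ``self.file_system.sep``) and uses made-up paths.

```python
def namespace_from_dir(
    package_name: str, namespace_dir: str, package_directory: str, sep: str = "/"
) -> str:
    # Strip the package's own path from the end to get its parent, e.g. "/site-packages/".
    parent_of_package_directory = package_directory[: -len(package_name)]
    # What remains of the namespace directory is its path relative to that parent.
    directory_relative_to_parent = namespace_dir[len(parent_of_package_directory) :]
    return ".".join(directory_relative_to_parent.split(sep))


if __name__ == "__main__":
    print(namespace_from_dir("google", "/site-packages/google/cloud", "/site-packages/google"))
    print(
        namespace_from_dir(
            "google.cloud", "/site-packages/google/cloud/logging_v2", "/site-packages/google/cloud"
        )
    )
```

Note that the first slice relies on the separator being a single character, so that a dotted name such as ``google.cloud`` has the same length as the path tail ``google/cloud``.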
14 changes: 4 additions & 10 deletions src/grimp/adaptors/packagefinder.py
@@ -12,10 +12,9 @@


class ImportLibPackageFinder(AbstractPackageFinder):
def determine_package_directory(
def determine_package_directories(
self, package_name: str, file_system: AbstractFileSystem
) -> str:
# TODO - do we need to add the current working directory here?
) -> set[str]:
# Attempt to locate the package file.
spec = importlib.util.find_spec(package_name)
if not spec:
@@ -26,13 +25,8 @@ def determine_package_directory(
if not self._is_a_package(spec, file_system) or self._has_a_non_namespace_parent(spec):
raise exceptions.NotATopLevelModule

return file_system.dirname(spec.origin)

raise exceptions.NamespacePackageEncountered(
f"Package '{package_name}' is a namespace package (see PEP 420). Try specifying the "
"portion name instead. If you are not intentionally using namespace packages, "
"adding an __init__.py file should fix the problem."
)
assert spec.submodule_search_locations # This should be the case if spec.has_location.
return set(spec.submodule_search_locations)
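The reason a set of directories is returned can be seen with the standard library alone. The following is a minimal sketch, not the finder itself: ``find_package_directories`` and ``demo`` are hypothetical names, and ``demo_spec_ns`` is a made-up namespace whose two portions live in separate temporary directories.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def find_package_directories(package_name: str) -> set[str]:
    """Return every directory that contributes to the named package."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise ValueError(f"{package_name!r} is not an importable package")
    # A regular package has one search location; a namespace package may have several.
    return set(spec.submodule_search_locations)


def demo() -> int:
    with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
        for root, portion in ((first, "foo"), (second, "bar")):
            portion_dir = os.path.join(root, "demo_spec_ns", portion)
            os.makedirs(portion_dir)
            open(os.path.join(portion_dir, "__init__.py"), "w").close()
        sys.path[:0] = [first, second]
        importlib.invalidate_caches()
        try:
            # One directory per location that provides part of the namespace.
            return len(find_package_directories("demo_spec_ns"))
        finally:
            sys.path.remove(first)
            sys.path.remove(second)


if __name__ == "__main__":
    print(demo())
```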

def _is_a_package(self, spec: ModuleSpec, file_system: AbstractFileSystem) -> bool:
assert spec.origin
8 changes: 5 additions & 3 deletions src/grimp/application/ports/modulefinder.py
@@ -1,26 +1,28 @@
import abc
from collections.abc import Set
from dataclasses import dataclass

from grimp.domain.valueobjects import Module

from .filesystem import AbstractFileSystem


@dataclass(frozen=True)
@dataclass(frozen=True, order=True)
class ModuleFile:
module: Module
mtime: float


@dataclass(frozen=True)
@dataclass(frozen=True, order=True)
class FoundPackage:
"""
Set of modules found under a single package, together with metadata.
"""

name: str
directory: str
module_files: frozenset[ModuleFile]
module_files: Set[ModuleFile]
namespace_packages: Set[str] = frozenset()
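As a small illustration of what ``order=True`` buys here: it generates the comparison methods, comparing fields in declaration order, so a set of these objects can be sorted into a stable sequence. The sketch below uses a plain ``str`` as a stand-in for Grimp's ``Module`` value object (an assumption made to keep it self-contained), and ``sorted_module_names`` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class ModuleFile:
    module: str  # Stand-in for grimp's Module value object.
    mtime: float


def sorted_module_names(module_files: frozenset[ModuleFile]) -> list[str]:
    # Sets have no defined iteration order; sorting relies on the generated __lt__.
    return [module_file.module for module_file in sorted(module_files)]


if __name__ == "__main__":
    files = frozenset({ModuleFile("pkg.b", 2.0), ModuleFile("pkg.a", 3.0)})
    print(sorted_module_names(files))
```

Without ``order=True``, the ``sorted`` call would raise ``TypeError`` because the instances would not support ``<``.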


class AbstractModuleFinder(abc.ABC):
4 changes: 2 additions & 2 deletions src/grimp/application/ports/packagefinder.py
@@ -5,7 +5,7 @@

class AbstractPackageFinder(abc.ABC):
@abc.abstractmethod
def determine_package_directory(
def determine_package_directories(
self, package_name: str, file_system: AbstractFileSystem
) -> str:
) -> set[str]:
raise NotImplementedError