Skip to content

Conversation

@JoergBudi
Copy link
Contributor

basically 'renovated' stale PR #308 from collections-770, ensures that rechaining chainedIterators multiple times now result in a single IteratorChain instance with all picked up underlying iterators, thus avoiding deeply nested imperformant IteratorChain instances. Additionally the UnmodifiableIterator gets a special treatment, so that SetUtils.SetView.iterator() also benefits from this solution. Raised test coverage.

Note: From the comments in PR 308 (collections-770), I think, the PR stalled because it did not directly address the issue in that ticket, however there might be other reasons I am not aware of.
But as far as I can see, it fits perfectly for Collections-722. Cudos to original author stevebosman-oc.

Thanks for your contribution to Apache Commons! Your help is appreciated!

Before you push a pull request, review this list:

  • Read the contribution guidelines for this project.
  • [-] Read the ASF Generative Tooling Guidance if you use Artificial Intelligence (AI).
  • [-] I used AI to create any part of, or all of, this pull request.
  • Run a successful build using the default Maven goal with mvn; that's mvn on the command line by itself.
  • Write unit tests that match behavioral changes, where the tests fail if the changes to the runtime are not applied. This may not always be possible, but it is a best-practice.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Each commit in the pull request should have a meaningful subject line and body. Note that a maintainer may squash commits during the merge process.

…pull/308 from collections-770, ensures that rechaining chainedIterators multiple times now result in a single IteratorChain instance with all picked up underlying iterators, thus avoiding deeply nested imperformant IteratorChain instances. Additionally the UnmodifiableIterator gets a special treatment, so that SetUtils.SetView.iterator() also benefits from this solution. Raised test coverage.
Copy link
Member

@garydgregory garydgregory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoergBudi Please see my comment in UnmodifiableIterator. TY!

…s-722-nested-chainedIterator

switched to new UmodifiableIterator.unwrap()
@JoergBudi
Copy link
Contributor Author

@garydgregory : Is there something blocking the merging of this PR ? As solutions for other issues (see discussion in collections-838 / PR 630) might depend on it, it would be useful when it gets merged when it suits you. You might also discuss with @ppkarwasz .

@JoergBudi
Copy link
Contributor Author

The recent #633 fixes the core problem of collections-722 as well.

However the suggested changes in this PR still would save additional CPU cycles, but I am personally of the opinion, that added complexity and hard to understand code do not justify these small benefits.

@ppkarwasz : What do you think ? Shall we still continue to pursue a change here ? If yes, I'd try your last suggestion via subclassing IteratorChain.

But atm I favor dropping this PR and closing the jira issue 722 as duplicate to collections-838

@ppkarwasz
Copy link
Contributor

Hi @JoergBudi,

I think this PR is still worth pursuing, but we could narrow its scope to just collapsing IteratorChain trees, and handle the Unmodifiable aspect in a separate PR.

For the unmodifiable part, I’ve experimented with the following approach:

  1. Introduce an UnmodifiableConvertible interface for classes that have an Unmodifiable equivalent:

    public interface UnmodifiableConvertible {
    
        /**
         * Returns an unmodifiable version of this instance.
         * <p>
         * The returned object must implement {@link Unmodifiable} and preserve all
         * public types (classes and interfaces) of the original instance.
         * </p>
         *
         * @return an unmodifiable version of this instance
         */
        Unmodifiable toUnmodifiable();
    }
  2. Update all existing Iterator implementations to either:

    • implement Unmodifiable (if they already throw unconditionally on remove()), or
    • implement UnmodifiableConvertible.

This change would touch a lot of lines, so I plan to submit it separately. It would however enable collapsing nested iterators when they are interleaved with UnmodifiableIterator.

One open question: should toUnmodifiable() be allowed to return this, or should it always return a new instance?

The term “unmodifiable” is a bit misleading here: these iterators still maintain internal mutable state; the restriction applies only to the remove() operation. That’s why I’m leaning toward having toUnmodifiable() always return a distinct “snapshot” reflecting the current state.

For IteratorChain, however, that snapshot would still share the same underlying iterators. In that case, it might make sense to specify that once toUnmodifiable() is called, the original instance should no longer be used, which would make returning this acceptable.

What do you think?

@JoergBudi
Copy link
Contributor Author

@ppkarwasz : About IteratorChain and returning this (e.g. with a boolean marking remove() to throw an exception): While semantically that would work you would violate the marker interface contract (you can't dynamically add Unmodifiable to the instance so you either lie before or after calling toUnmodifiable()). So I think you continue to need similar unwrapping code knowing the possible classes one nesting level deeper, however it is difficult to think that through without actually trying it out.

About this PR: I checked, that on a millisecond scale, the performance benefit of this PR is not really measurable up to a 1000 nestings. After that (5k nestings), the PR indeed avoids the StackOverflowError, but as no one has created a defect about that, I guess so deep nestings are uncommon. So have you got suggestions how to go on here ? Shall we merge the PR as is or have you got other improvements in mind beside the UnmodifiableConvertible I could add ?

@ppkarwasz
Copy link
Contributor

@JoergBudi,

About IteratorChain and returning this (e.g. with a boolean marking remove() to throw an exception): While semantically that would work you would violate the marker interface contract (you can't dynamically add Unmodifiable to the instance so you either lie before or after calling toUnmodifiable()). So I think you continue to need similar unwrapping code knowing the possible classes one nesting level deeper, however it is difficult to think that through without actually trying it out.

When I talked about returning this, I was rather thinking that IteratorChain.toUnmodifiable() would return a new derivative class UnmodifiableIteratorChain, while UnmodifiableIteratorChain.toUnmodifiable() would return this.

About this PR: I checked, that on a millisecond scale, the performance benefit of this PR is not really measurable up to a 1000 nestings. After that (5k nestings), the PR indeed avoids the StackOverflowError, but as no one has created a defect about that, I guess so deep nestings are uncommon. So have you got suggestions how to go on here ? Shall we merge the PR as is or have you got other improvements in mind beside the UnmodifiableConvertible I could add ?

I think we should merge it as is. Can you add yourself to the changes.xml file? @garydgregory, any objections?

@JoergBudi
Copy link
Contributor Author

added line changes.xml for this PR.
@ppkarwasz: about the other discussion, you first wrote, that an iterator would either implement Unmodifiable OR UnmodifiableConvertible, but not both. However, I get, that it is convenient for the user have both interfaces implemented so that no if/else is needed when using it, and then returning "this" seems ok to me. But I don't think, that the UnmodifiableIteratorChain would be really that special, e.g. FilterIterator would share the same issue, that the parent FilterIterator might not be usable anymore.

@ppkarwasz ppkarwasz merged commit 845c0e5 into apache:master Aug 19, 2025
9 checks passed
@ppkarwasz
Copy link
Contributor

But I don't think, that the UnmodifiableIteratorChain would be really that special, e.g. FilterIterator would share the same issue, that the parent FilterIterator might not be usable anymore.

Yes, UnmodifiableIteratorChain wouldn't be special, but there are a couple of “resettable” iterators, where the status of the iterators can be reset.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants