Skip to content

Conversation

@wolf31o2
Copy link
Member

@wolf31o2 wolf31o2 commented Dec 16, 2025


Summary by cubic

Implements CIP-0137 distributed message queue with node-to-node and node-to-client protocols, plus authentication and TTL checks. Enables secure message submission and notifications with blocking and non-blocking flows.

  • New Features
    • Common DMQ types, message authentication (opcert/KES/pool registration), and TTL validation.
    • MessageSubmission (ID 17): init, request/reply for message IDs and messages, blocking/non-blocking, bounded queue.
    • LocalMessageSubmission (ID 18): submit messages; accept/reject with standard reasons; handler hook.
    • LocalMessageNotification (ID 19): request messages (blocking/non-blocking); internal queue; wait-for-message signaling.
    • CBOR encoding/decoding, protocol state machines, and timeouts per CIP-0137, with tests for authentication, TTL, and message formats.

Written for commit b3a9518. Summary will update automatically on new commits.

Summary by CodeRabbit

  • New Features

    • End-to-end message authenticator with certificate & signature checks, pluggable KES verifier, SPO pool tracking, KES period rotation checks, and TTL validation
    • New DMQ message payloads and structured rejection reasons (invalid, already received, expired, other)
    • Added Local Message Notification, Local Message Submission, and Message Submission protocols with full state machines and blocking/non-blocking flows
  • Tests

    • Extensive unit tests covering authentication, TTL behavior, KES verifier paths, message types, and protocol state transitions

✏️ Tip: You can customize this high-level summary in your review settings.

@wolf31o2 wolf31o2 requested a review from a team as a code owner December 16, 2025 18:05
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 16, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

This PR adds CIP-0137–style message authentication and TTL validation, DMQ data structures, and three protocol implementations: Local Message Notification, Local Message Submission, and Message Submission. It introduces Message, MessagePayload, and OperationalCertificate types; a MessageAuthenticator with operational-certificate and KES verification (pluggable verifier, cache, pool tracking, and package-level default), a TTLValidator, per-protocol message types with CBOR parsing (NewMsgFromCbor), state machines with timeouts, queuing/dispatch logic, and comprehensive unit tests for authentication and protocol messages.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~90+ minutes

  • protocol/common/authentication.go: MessageAuthenticator (KES flow, opcert cache, pool tracking, atomic default verifier), TTLValidator logic — high logic density and security-sensitive code.
  • protocol/common/dmq.go: new DMQ types and RejectReason implementations — verify correctness and edge cases.
  • protocol/localmessagenotification/, protocol/localmessagesubmission/, protocol/messagesubmission/*: three full protocol implementations with state machines, timeouts, concurrency, queueing, and CBOR message handling — heterogeneous, dense changes requiring careful review.
  • Tests: protocol/common/authentication_test.go and multiple message tests — ensure coverage matches implementation and verify cryptographic/mock paths.
  • Concurrency and synchronization across new code (mutexes, channels) and public API surface area.

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and specifically describes the main change: implementing a distributed message queue following CIP-0137. It concisely identifies the feature and standard being implemented.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/dmq-cip-0137

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9e2643b and b3a9518.

📒 Files selected for processing (12)
  • protocol/common/authentication.go (1 hunks)
  • protocol/common/authentication_test.go (1 hunks)
  • protocol/common/dmq.go (1 hunks)
  • protocol/localmessagenotification/localmessagenotification.go (1 hunks)
  • protocol/localmessagenotification/messages.go (1 hunks)
  • protocol/localmessagenotification/messages_test.go (1 hunks)
  • protocol/localmessagesubmission/localmessagesubmission.go (1 hunks)
  • protocol/localmessagesubmission/messages.go (1 hunks)
  • protocol/localmessagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messages.go (1 hunks)
  • protocol/messagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messagesubmission.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • protocol/localmessagesubmission/messages_test.go
  • protocol/localmessagenotification/messages.go
  • protocol/localmessagesubmission/messages.go
  • protocol/localmessagenotification/messages_test.go
  • protocol/messagesubmission/messages_test.go
  • protocol/common/dmq.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-04T15:54:22.683Z
Learnt from: wolf31o2
Repo: blinklabs-io/gouroboros PR: 1093
File: ledger/mary/pparams.go:143-150
Timestamp: 2025-11-04T15:54:22.683Z
Learning: In the blinklabs-io/gouroboros repository, the design goal for CBOR round-trip tests is to achieve byte-identical encoding WITHOUT using stored CBOR (cbor.DecodeStoreCbor). Instead, the approach uses proper field types (pointers for optional fields) and relies on the cbor package's deterministic encoding (SortCoreDeterministic) to ensure reproducible output. The stored CBOR pattern should not be suggested as a solution for round-trip fidelity in this codebase.

Applied to files:

  • protocol/messagesubmission/messages.go
  • protocol/common/authentication.go
🧬 Code graph analysis (4)
protocol/localmessagesubmission/localmessagesubmission.go (4)
protocol/state.go (6)
  • NewState (38-43)
  • StateMapEntry (62-67)
  • AgencyClient (27-27)
  • StateTransition (51-55)
  • AgencyServer (28-28)
  • AgencyNone (26-26)
protocol/localmessagesubmission/messages.go (11)
  • MessageTypeSubmitMessage (32-32)
  • MessageTypeDone (35-35)
  • MessageTypeAcceptMessage (33-33)
  • MessageTypeRejectMessage (34-34)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • NewMsgFromCbor (61-87)
  • MsgSubmitMessage (39-42)
  • MsgAcceptMessage (45-47)
  • MsgRejectMessage (50-53)
  • MsgDone (56-58)
protocol/messagesubmission/messages.go (5)
  • MessageTypeDone (37-37)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • NewMsgFromCbor (77-105)
  • MsgDone (72-74)
protocol/common/dmq.go (6)
  • Message (53-63)
  • RejectReason (83-86)
  • InvalidReason (89-92)
  • OtherReason (116-119)
  • AlreadyReceivedReason (100-100)
  • ExpiredReason (108-108)
protocol/localmessagenotification/localmessagenotification.go (2)
protocol/state.go (6)
  • NewState (38-43)
  • StateMapEntry (62-67)
  • AgencyClient (27-27)
  • StateTransition (51-55)
  • AgencyServer (28-28)
  • AgencyNone (26-26)
protocol/localmessagenotification/messages.go (10)
  • MessageTypeRequestMessages (32-32)
  • MsgRequestMessages (39-42)
  • MessageTypeClientDone (35-35)
  • MessageTypeReplyMessagesNonBlocking (33-33)
  • MessageTypeReplyMessagesBlocking (34-34)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • MsgReplyMessagesNonBlocking (45-49)
  • MsgReplyMessagesBlocking (52-55)
  • MsgClientDone (58-60)
protocol/messagesubmission/messagesubmission.go (4)
protocol/state.go (3)
  • NewState (38-43)
  • StateMapEntry (62-67)
  • StateTransition (51-55)
protocol/common/dmq.go (2)
  • Message (53-63)
  • MessageIDAndSize (66-72)
protocol/localmessagenotification/messages.go (5)
  • MessageTypeRequestMessages (32-32)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • NewMsgFromCbor (63-87)
  • MsgRequestMessages (39-42)
protocol/localmessagesubmission/messages.go (5)
  • MessageTypeDone (35-35)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • NewMsgFromCbor (61-87)
  • MsgDone (56-58)
protocol/common/authentication_test.go (3)
protocol/common/authentication.go (3)
  • NewMessageAuthenticator (85-97)
  • SetDefaultKESVerifier (64-68)
  • ApplyDefaultKESVerifier (72-82)
protocol/common/dmq.go (3)
  • Message (53-63)
  • MessagePayload (27-37)
  • OperationalCertificate (40-50)
cbor/encode.go (1)
  • Encode (28-41)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

🧹 Nitpick comments (11)
protocol/localmessagesubmission/messages_test.go (2)

25-51: Test name suggests encoding but only verifies Type() method.

The test TestMsgSubmitMessageEncoding doesn't actually test CBOR encoding—it only verifies the Type() method and field access. Consider either renaming to TestMsgSubmitMessageType or adding actual encode/decode round-trip assertions to match the test name. Based on retrieved learnings, this codebase prefers byte-identical CBOR round-trip testing without stored CBOR.


137-142: Consider adding positive test cases for NewMsgFromCbor.

Currently only the unknown type error path is tested. Adding tests with valid CBOR payloads for each message type would improve coverage of the decode success path.

protocol/common/authentication_test.go (1)

33-43: Tests access internal map fields directly.

Accessing auth.spoPoolIDs directly couples tests to implementation details. If the internal data structure changes, these tests will break. Consider adding getter methods or using public API to verify registration state.

protocol/localmessagesubmission/messages.go (1)

82-85: Nil check is redundant.

After the switch statement, ret is always non-nil (the default case returns an error). The if ret != nil check on line 82 is unnecessary. However, this pattern is consistent with other protocol implementations (localmessagenotification, messagesubmission), so keeping it maintains consistency.

protocol/messagesubmission/messages.go (1)

102-105: Redundant nil check.

After the switch statement, ret is never nil when reaching this point—either a concrete type was assigned (lines 81-91) or an error was returned (lines 93-97). Removing this check simplifies the code.

-	if ret != nil {
-		// Store the raw message CBOR
-		ret.SetCbor(data)
-	}
+	// Store the raw message CBOR
+	ret.SetCbor(data)
protocol/localmessagenotification/messages.go (1)

84-87: Same redundant nil check as in messagesubmission.

After the switch, ret cannot be nil when reaching this point.

-	if ret != nil {
-		// Store the raw message CBOR
-		ret.SetCbor(data)
-	}
+	// Store the raw message CBOR
+	ret.SetCbor(data)
protocol/localmessagesubmission/localmessagesubmission.go (3)

271-277: Redundant fallback in Send method.

When l.sendFn is nil, the code calls l.SendMessage(msg). However, sendFn is always set to proto.SendMessage in NewLocalMessageSubmission (line 146), so the fallback branch is unreachable in normal usage.


340-350: handleRejectMessage discards the rejection reason.

The msg.Reason is not logged or propagated. Consider logging the rejection reason for debugging purposes.

 func (l *LocalMessageSubmission) handleRejectMessage(
 	msg *MsgRejectMessage,
 ) error {
 	l.lock.Lock()
 	defer l.lock.Unlock()
 	if l.state != StateBusy {
 		return fmt.Errorf("invalid state for reject: %v", l.state)
 	}
+	// Log rejection for debugging (logger could be added to struct)
 	l.state = StateIdle
 	return nil
 }

352-358: handleDone does not validate state before transitioning.

Unlike other handlers, handleDone does not check the current state before transitioning to StateDone. Per the state map, Done is only valid from Idle state. Consider adding a state check for consistency.

 func (l *LocalMessageSubmission) handleDone() error {
 	l.lock.Lock()
 	defer l.lock.Unlock()
+	if l.state != StateIdle {
+		return fmt.Errorf("invalid state for done: %v", l.state)
+	}
 	l.state = StateDone
 	l.Stop()
 	return nil
 }
protocol/messagesubmission/messagesubmission.go (1)

547-548: Potential slice bounds issue with AckCount.

If req.AckCount equals len(m.pendingMessageIDs), the slice becomes empty, which is fine. However, if req.AckCount exceeds the length, the condition int(req.AckCount) <= len(m.pendingMessageIDs) prevents the truncation. This is correct but allows silently ignoring invalid ack counts. Consider logging a warning for protocol debugging.

protocol/localmessagenotification/localmessagenotification.go (1)

125-125: Unused field acknowledgedIDs.

The acknowledgedIDs map is initialized in the constructor but never read or written elsewhere in this file. Consider removing it if not needed, or implementing the acknowledgment tracking logic.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d992602 and 38df06a.

📒 Files selected for processing (12)
  • protocol/common/authentication.go (1 hunks)
  • protocol/common/authentication_test.go (1 hunks)
  • protocol/common/dmq.go (1 hunks)
  • protocol/localmessagenotification/localmessagenotification.go (1 hunks)
  • protocol/localmessagenotification/messages.go (1 hunks)
  • protocol/localmessagenotification/messages_test.go (1 hunks)
  • protocol/localmessagesubmission/localmessagesubmission.go (1 hunks)
  • protocol/localmessagesubmission/messages.go (1 hunks)
  • protocol/localmessagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messages.go (1 hunks)
  • protocol/messagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messagesubmission.go (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-04T15:54:22.683Z
Learnt from: wolf31o2
Repo: blinklabs-io/gouroboros PR: 1093
File: ledger/mary/pparams.go:143-150
Timestamp: 2025-11-04T15:54:22.683Z
Learning: In the blinklabs-io/gouroboros repository, the design goal for CBOR round-trip tests is to achieve byte-identical encoding WITHOUT using stored CBOR (cbor.DecodeStoreCbor). Instead, the approach uses proper field types (pointers for optional fields) and relies on the cbor package's deterministic encoding (SortCoreDeterministic) to ensure reproducible output. The stored CBOR pattern should not be suggested as a solution for round-trip fidelity in this codebase.

Applied to files:

  • protocol/messagesubmission/messages_test.go
🧬 Code graph analysis (9)
protocol/messagesubmission/messages.go (3)
protocol/message.go (1)
  • MessageBase (25-30)
protocol/common/dmq.go (2)
  • MessageIDAndSize (61-66)
  • Message (49-58)
cbor/decode.go (1)
  • Decode (29-44)
protocol/common/authentication.go (2)
protocol/common/dmq.go (2)
  • Message (49-58)
  • OperationalCertificate (37-46)
cbor/encode.go (1)
  • Encode (28-41)
protocol/localmessagesubmission/messages.go (3)
protocol/message.go (1)
  • MessageBase (25-30)
protocol/localmessagenotification/messages.go (2)
  • NewMsgFromCbor (63-89)
  • ProtocolName (26-26)
protocol/messagesubmission/messages.go (4)
  • NewMsgFromCbor (77-107)
  • MessageTypeDone (37-37)
  • MsgDone (72-74)
  • ProtocolName (26-26)
protocol/common/authentication_test.go (2)
protocol/common/authentication.go (2)
  • NewMessageAuthenticator (43-53)
  • NewTTLValidator (254-271)
protocol/common/dmq.go (3)
  • Message (49-58)
  • MessagePayload (25-34)
  • OperationalCertificate (37-46)
protocol/localmessagenotification/messages.go (4)
protocol/message.go (1)
  • MessageBase (25-30)
protocol/common/dmq.go (1)
  • Message (49-58)
protocol/localmessagesubmission/messages.go (2)
  • NewMsgFromCbor (61-87)
  • ProtocolName (26-26)
cbor/decode.go (1)
  • Decode (29-44)
protocol/localmessagesubmission/messages_test.go (3)
protocol/localmessagesubmission/messages.go (9)
  • MsgSubmitMessage (39-42)
  • MessageTypeSubmitMessage (32-32)
  • MsgAcceptMessage (45-47)
  • MessageTypeAcceptMessage (33-33)
  • MsgRejectMessage (50-53)
  • MessageTypeRejectMessage (34-34)
  • MsgDone (56-58)
  • MessageTypeDone (35-35)
  • NewMsgFromCbor (61-87)
protocol/common/dmq.go (7)
  • Message (49-58)
  • MessagePayload (25-34)
  • OperationalCertificate (37-46)
  • InvalidReason (83-86)
  • AlreadyReceivedReason (94-94)
  • ExpiredReason (102-102)
  • OtherReason (110-113)
protocol/localmessagenotification/messages.go (1)
  • NewMsgFromCbor (63-89)
protocol/localmessagenotification/localmessagenotification.go (5)
protocol/state.go (6)
  • NewState (38-43)
  • StateMapEntry (62-67)
  • AgencyClient (27-27)
  • StateTransition (51-55)
  • AgencyServer (28-28)
  • AgencyNone (26-26)
protocol/localmessagenotification/messages.go (7)
  • MessageTypeRequestMessages (32-32)
  • MsgRequestMessages (39-42)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • MsgReplyMessagesNonBlocking (45-49)
  • MsgReplyMessagesBlocking (52-55)
  • MsgClientDone (58-60)
protocol/protocol.go (5)
  • Protocol (39-60)
  • ProtocolOptions (99-108)
  • ProtocolConfig (63-77)
  • MessageFromCborFunc (119-119)
  • MessageHandlerFunc (116-116)
protocol/common/authentication.go (4)
  • MessageAuthenticator (32-40)
  • TTLValidator (247-250)
  • NewMessageAuthenticator (43-53)
  • NewTTLValidator (254-271)
protocol/message.go (1)
  • MessageBase (25-30)
protocol/localmessagesubmission/localmessagesubmission.go (4)
protocol/state.go (6)
  • NewState (38-43)
  • StateMapEntry (62-67)
  • AgencyClient (27-27)
  • StateTransition (51-55)
  • AgencyServer (28-28)
  • AgencyNone (26-26)
protocol/messagesubmission/messages.go (5)
  • MessageTypeDone (37-37)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • NewMsgFromCbor (77-107)
  • MsgDone (72-74)
protocol/protocol.go (5)
  • Protocol (39-60)
  • ProtocolOptions (99-108)
  • ProtocolConfig (63-77)
  • MessageFromCborFunc (119-119)
  • MessageHandlerFunc (116-116)
protocol/common/authentication.go (4)
  • MessageAuthenticator (32-40)
  • TTLValidator (247-250)
  • NewMessageAuthenticator (43-53)
  • NewTTLValidator (254-271)
protocol/messagesubmission/messagesubmission.go (4)
protocol/state.go (6)
  • NewState (38-43)
  • StateMapEntry (62-67)
  • AgencyClient (27-27)
  • StateTransition (51-55)
  • AgencyServer (28-28)
  • AgencyNone (26-26)
protocol/messagesubmission/messages.go (11)
  • MessageTypeInit (32-32)
  • MessageTypeRequestMessageIds (33-33)
  • MsgRequestMessageIds (46-51)
  • MessageTypeRequestMessages (35-35)
  • MessageTypeDone (37-37)
  • MessageTypeReplyMessageIds (34-34)
  • MessageTypeReplyMessages (36-36)
  • MsgReplyMessageIds (54-57)
  • MsgRequestMessages (60-63)
  • MsgReplyMessages (66-69)
  • MsgDone (72-74)
protocol/protocol.go (5)
  • Protocol (39-60)
  • ProtocolOptions (99-108)
  • ProtocolConfig (63-77)
  • MessageFromCborFunc (119-119)
  • MessageHandlerFunc (116-116)
protocol/common/authentication.go (2)
  • MessageAuthenticator (32-40)
  • NewMessageAuthenticator (43-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Analyze (go)
🔇 Additional comments (23)
protocol/localmessagenotification/messages_test.go (1)

1-164: Test structure is consistent with other protocol tests.

Good coverage of both blocking and non-blocking message variants, including the empty messages edge case. The test structure aligns well with the other protocol test files in this PR.

protocol/messagesubmission/messages_test.go (1)

36-51: Good coverage of MsgRequestMessageIds fields.

The test properly validates all three configurable fields (IsBlocking, AckCount, RequestCount) with expected values.

protocol/common/authentication_test.go (2)

147-204: Solid TTL validation test coverage.

Tests cover expired messages, valid messages within bounds, and messages too far in the future. The boundary conditions are well tested.


99-130: Excellent KES period rotation test coverage.

The test properly validates the sequential state machine: first registration succeeds, same number allowed, higher number updates, and lower number (rollback) correctly fails with appropriate error message.

protocol/localmessagesubmission/messages.go (1)

25-107: Implementation follows established protocol patterns.

The message type constants, struct definitions embedding protocol.MessageBase, NewMsgFromCbor factory, and Type() accessors all align with the existing protocol implementations in messagesubmission and localmessagenotification.

protocol/common/dmq.go (3)

68-73: IsValid() using time.Now() is acceptable for this use case.

While injecting time would improve testability, the current approach is simple and the tests work around it using time.Now().Add(). The #nosec G115 annotation is appropriate.


75-118: Clean RejectReason interface design.

The interface with four concrete implementations provides type-safe rejection handling. The reason type codes (0-3) are well-documented and implementations are consistent.


24-34: Add CBOR struct tags to MessagePayload and related DMQ structs for consistency with codebase conventions.

The MessagePayload, OperationalCertificate, Message, and MessageIDAndSize structs in protocol/common/dmq.go lack CBOR encoding directives. Every similar struct in the codebase uses _ struct{} \cbor:",toarray"`` to specify array-based CBOR encoding. Since dmq.go explicitly references CIP-0137, verify the specification's encoding requirements and add the appropriate CBOR tags to ensure deterministic, array-based serialization consistent with the rest of the protocol implementation.

protocol/messagesubmission/messages.go (3)

1-28: LGTM on protocol constants and imports.

The protocol constants (ProtocolName, ProtocolID = 17) and message type constants are well-defined and align with CIP-0137 specification.


40-74: Message type definitions look correct.

All message structs properly embed protocol.MessageBase and define fields consistent with CIP-0137 CDDL specification. The use of pcommon.MessageIDAndSize and pcommon.Message from the shared common package is appropriate.


109-137: Type() methods are correctly implemented.

Each message type returns its corresponding MessageType constant. Implementation follows the established pattern from other protocol message files.

protocol/localmessagenotification/messages.go (3)

25-36: Protocol constants defined correctly.

Protocol ID 19 and message type constants align with CIP-0137 specification for LocalMessageNotification.


44-55: Verify blocking vs non-blocking reply semantics.

MsgReplyMessagesNonBlocking has a HasMore field while MsgReplyMessagesBlocking does not. Please confirm this asymmetry is intentional per CIP-0137 CDDL—blocking replies may continue until done, while non-blocking needs to signal more availability.


91-109: Type() implementations are correct.

All methods return the appropriate message type constant.

protocol/localmessagesubmission/localmessagesubmission.go (1)

44-84: State machine and timeouts are well-defined.

The state map correctly defines transitions and timeouts per CIP-0137. Agency assignments (Client for Idle, Server for Busy, None for Done) follow the protocol specification.

protocol/messagesubmission/messagesubmission.go (3)

50-152: State machine structure is comprehensive and well-organized.

The state map correctly defines all transitions with appropriate timeouts and agencies. The use of MatchFunc for distinguishing blocking vs non-blocking RequestMessageIds is a clean approach.


681-699: collectAvailableIDs called without lock may cause data race.

This helper accesses m.messageQueue but is called from handleRequestMessageIds after the lock is released (line 588 unlocks before line 590 calls sendFn). Wait—looking more carefully, line 564 calls collectAvailableIDs while the lock is still held. However, the function accesses m.messageQueue which requires synchronization.

Tracing the call at line 564: the lock is acquired at line 540 and is still held. This is safe. Approved.


701-710: lookupMessageSize iterates messageQueue without synchronization.

This function is called from within handleRequestMessageIds at line 578, where the lock is still held (acquired at line 540, released at line 588). This appears safe in the current call path.

protocol/common/authentication.go (2)

273-304: TTL validation implementation is correct.

The validation logic properly checks both expired messages and messages with expiration too far in the future. The #nosec G115 annotation for the uint32 cast is appropriate given the 2106 overflow note.


306-320: GetTimeUntilExpiration is well-implemented.

Handles nil message and expired case correctly. Returns proper time.Duration.

protocol/localmessagenotification/localmessagenotification.go (3)

300-331: Good enforcement of CIP-0137 blocking reply constraint.

The check at line 315-318 correctly enforces that blocking replies must contain at least one message per the specification.


397-403: LGTM!

The sendFn hook pattern enables dependency injection for testing while providing a sensible fallback.


512-537: Partial enqueue on validation failure.

If validation fails for message N, messages 0 through N-1 have already been enqueued and signaled. If all-or-nothing semantics are expected, validate all messages before enqueuing. Additionally, line 532 has the same closed-channel panic risk as noted earlier.

Is partial enqueue the intended behavior, or should the batch be atomic? If atomic:

 func (l *LocalMessageNotification) enqueueMessages(
 	messages []pcommon.Message,
 ) error {
+	// Validate all messages first
+	for _, message := range messages {
+		if l.ttlValidator != nil {
+			if err := l.ttlValidator.ValidateMessageTTL(&message); err != nil {
+				return err
+			}
+		}
+		if l.authenticator != nil {
+			if err := l.authenticator.VerifyMessage(&message); err != nil {
+				return err
+			}
+		}
+	}
+	// Check capacity for entire batch
+	if len(l.messageQueue)+len(messages) > l.maxQueueSize {
+		return errors.New("message queue is full")
+	}
+	// Enqueue all
 	for _, message := range messages {
-		// ... validation and capacity checks ...
 		msgCopy := message
 		l.messageQueue = append(l.messageQueue, &msgCopy)
-		select {
-		case l.newMessageSignal <- struct{}{}:
-		default:
-		}
 	}
+	// Signal once after batch
+	select {
+	case l.newMessageSignal <- struct{}{}:
+	default:
+	}
 	return nil
 }

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 issues found across 12 files

Prompt for AI agents (all 5 issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="protocol/localmessagesubmission/localmessagesubmission.go">

<violation number="1" location="protocol/localmessagesubmission/localmessagesubmission.go:163">
P1: State is set to `StateBusy` before validation, but if validation fails, the state is not rolled back. This leaves the protocol in an inconsistent state where subsequent `SubmitMessage` calls will fail. Either perform validation before changing state, or reset state to `StateIdle` on validation failure.</violation>
</file>

<file name="protocol/common/authentication.go">

<violation number="1" location="protocol/common/authentication.go:188">
P0: Critical security issue: KES signature verification is not implemented. This function checks structure but always returns success without actually verifying the signature. Any forged message with correct byte lengths will pass authentication. At minimum, add a TODO comment or return an error indicating the stub status, rather than silently accepting all signatures.</violation>
</file>

<file name="protocol/messagesubmission/messages_test.go">

<violation number="1" location="protocol/messagesubmission/messages_test.go:141">
P2: Test `TestNewMsgFromCborInit` doesn&#39;t test what it claims. The name says it tests &quot;parsing MsgInit from CBOR&quot; but it never calls `NewMsgFromCbor`. Consider either renaming the test or actually testing CBOR parsing like `TestNewMsgFromCborUnknownType` does.</violation>
</file>

<file name="protocol/messagesubmission/messagesubmission.go">

<violation number="1" location="protocol/messagesubmission/messagesubmission.go:626">
P1: Data race: `collectMessagesByID` is called without holding the lock, but it accesses `m.messageQueue` which can be concurrently modified by other methods. Either hold the lock during the call or make `collectMessagesByID` acquire its own lock.</violation>
</file>

<file name="protocol/localmessagenotification/localmessagenotification.go">

<violation number="1" location="protocol/localmessagenotification/localmessagenotification.go:377">
P1: Messages are never consumed from the queue. All non-expired messages are added to `filtered` regardless of whether they were already returned in `available`. This causes duplicate message delivery on subsequent calls. The message should only be added to `filtered` if it wasn&#39;t already added to `available`.</violation>
</file>

Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR

l.state,
)
}
l.state = StateBusy
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: State is set to StateBusy before validation, but if validation fails, the state is not rolled back. This leaves the protocol in an inconsistent state where subsequent SubmitMessage calls will fail. Either perform validation before changing state, or reset state to StateIdle on validation failure.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At protocol/localmessagesubmission/localmessagesubmission.go, line 163:

<comment>State is set to `StateBusy` before validation, but if validation fails, the state is not rolled back. This leaves the protocol in an inconsistent state where subsequent `SubmitMessage` calls will fail. Either perform validation before changing state, or reset state to `StateIdle` on validation failure.</comment>

<file context>
@@ -0,0 +1,368 @@
+			l.state,
+		)
+	}
+	l.state = StateBusy
+	l.lock.Unlock()
+
</file context>
Fix with Cubic


// In production, this would use actual KES signature verification
// For now, we just verify structure. A full implementation would need KES library
m.logger.Debug("KES signature verified", "payload_size", len(wrappedCbor))
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0: Critical security issue: KES signature verification is not implemented. This function checks structure but always returns success without actually verifying the signature. Any forged message with correct byte lengths will pass authentication. At minimum, add a TODO comment or return an error indicating the stub status, rather than silently accepting all signatures.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At protocol/common/authentication.go, line 188:

<comment>Critical security issue: KES signature verification is not implemented. This function checks structure but always returns success without actually verifying the signature. Any forged message with correct byte lengths will pass authentication. At minimum, add a TODO comment or return an error indicating the stub status, rather than silently accepting all signatures.</comment>

<file context>
@@ -0,0 +1,320 @@
+
+	// In production, this would use actual KES signature verification
+	// For now, we just verify structure. A full implementation would need KES library
+	m.logger.Debug(&quot;KES signature verified&quot;, &quot;payload_size&quot;, len(wrappedCbor))
+
+	return nil
</file context>
Fix with Cubic

}

// TestNewMsgFromCborInit tests parsing MsgInit from CBOR
func TestNewMsgFromCborInit(t *testing.T) {
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Test TestNewMsgFromCborInit doesn't test what it claims. The name says it tests "parsing MsgInit from CBOR" but it never calls NewMsgFromCbor. Consider either renaming the test or actually testing CBOR parsing like TestNewMsgFromCborUnknownType does.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At protocol/messagesubmission/messages_test.go, line 141:

<comment>Test `TestNewMsgFromCborInit` doesn&#39;t test what it claims. The name says it tests &quot;parsing MsgInit from CBOR&quot; but it never calls `NewMsgFromCbor`. Consider either renaming the test or actually testing CBOR parsing like `TestNewMsgFromCborUnknownType` does.</comment>

<file context>
@@ -0,0 +1,158 @@
+}
+
+// TestNewMsgFromCborInit tests parsing MsgInit from CBOR
+func TestNewMsgFromCborInit(t *testing.T) {
+	// This would require actual CBOR encoding/decoding
+	// For now, just test the message type routing
</file context>

✅ Addressed in f673e27

m.state = StateMessages
m.lock.Unlock()

messages := m.collectMessagesByID(req.MessageIDs)
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Data race: collectMessagesByID is called without holding the lock, but it accesses m.messageQueue which can be concurrently modified by other methods. Either hold the lock during the call or make collectMessagesByID acquire its own lock.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At protocol/messagesubmission/messagesubmission.go, line 626:

<comment>Data race: `collectMessagesByID` is called without holding the lock, but it accesses `m.messageQueue` which can be concurrently modified by other methods. Either hold the lock during the call or make `collectMessagesByID` acquire its own lock.</comment>

<file context>
@@ -0,0 +1,745 @@
+	m.state = StateMessages
+	m.lock.Unlock()
+
+	messages := m.collectMessagesByID(req.MessageIDs)
+
+	reply := &amp;MsgReplyMessages{
</file context>

✅ Addressed in b3a9518

if len(available) < maxCount {
available = append(available, *msg)
}
filtered = append(filtered, msg)
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Messages are never consumed from the queue. All non-expired messages are added to filtered regardless of whether they were already returned in available. This causes duplicate message delivery on subsequent calls. The message should only be added to filtered if it wasn't already added to available.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At protocol/localmessagenotification/localmessagenotification.go, line 377:

<comment>Messages are never consumed from the queue. All non-expired messages are added to `filtered` regardless of whether they were already returned in `available`. This causes duplicate message delivery on subsequent calls. The message should only be added to `filtered` if it wasn&#39;t already added to `available`.</comment>

<file context>
@@ -0,0 +1,585 @@
+			if len(available) &lt; maxCount {
+				available = append(available, *msg)
+			}
+			filtered = append(filtered, msg)
+		}
+	}
</file context>
Fix with Cubic

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (3)
protocol/messagesubmission/messages_test.go (1)

17-27: Consolidate duplicate import blocks.

There are two separate import blocks. Consider merging them into a single block for idiomatic Go style.

 import (
 	"testing"

+	"github.com/blinklabs-io/gouroboros/cbor"
 	"github.com/blinklabs-io/gouroboros/protocol"
 	pcommon "github.com/blinklabs-io/gouroboros/protocol/common"
 	"github.com/stretchr/testify/assert"
 )
-
-import (
-	"github.com/blinklabs-io/gouroboros/cbor"
-)
protocol/localmessagesubmission/localmessagesubmission.go (1)

342-362: Handlers handleAcceptMessage and handleRejectMessage don't use the message content.

handleRejectMessage receives msg *MsgRejectMessage but ignores msg.Reason. Consider logging or exposing the rejection reason for observability.

 func (l *LocalMessageSubmission) handleRejectMessage(
 	msg *MsgRejectMessage,
 ) error {
 	l.lock.Lock()
 	defer l.lock.Unlock()
 	if l.state != StateBusy {
 		return fmt.Errorf("invalid state for reject: %v", l.state)
 	}
 	l.state = StateIdle
+	// TODO: Consider logging or exposing msg.Reason for observability
 	return nil
 }
protocol/localmessagenotification/localmessagenotification.go (1)

282-308: Consider deferring state transition until after successful send.

The state transitions to Idle (line 296) before sending the reply (line 307). If Send fails, the state is inconsistent—the peer never received the response, but local state is already Idle. Consider sending first, then updating state only on success, matching the pattern in handleRequestMessages (lines 472-478).

Apply this pattern to both blocking and non-blocking cases:

-	l.state = StateIdle
-	l.lock.Unlock()
-
 	msg := &MsgReplyMessagesNonBlocking{
 		MessageBase: protocol.MessageBase{
 			MessageType: MessageTypeReplyMessagesNonBlocking,
 		},
 		Messages: messages,
 		HasMore:  hasMore,
 	}
 
-	return l.Send(msg)
+	if err := l.Send(msg); err != nil {
+		return err
+	}
+	l.lock.Lock()
+	l.state = StateIdle
+	l.lock.Unlock()
+	return nil
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 38df06a and f673e27.

📒 Files selected for processing (12)
  • protocol/common/authentication.go (1 hunks)
  • protocol/common/authentication_test.go (1 hunks)
  • protocol/common/dmq.go (1 hunks)
  • protocol/localmessagenotification/localmessagenotification.go (1 hunks)
  • protocol/localmessagenotification/messages.go (1 hunks)
  • protocol/localmessagenotification/messages_test.go (1 hunks)
  • protocol/localmessagesubmission/localmessagesubmission.go (1 hunks)
  • protocol/localmessagesubmission/messages.go (1 hunks)
  • protocol/localmessagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messages.go (1 hunks)
  • protocol/messagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messagesubmission.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • protocol/common/authentication_test.go
  • protocol/localmessagenotification/messages_test.go
  • protocol/localmessagenotification/messages.go
  • protocol/messagesubmission/messages.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-04T15:54:22.683Z
Learnt from: wolf31o2
Repo: blinklabs-io/gouroboros PR: 1093
File: ledger/mary/pparams.go:143-150
Timestamp: 2025-11-04T15:54:22.683Z
Learning: In the blinklabs-io/gouroboros repository, the design goal for CBOR round-trip tests is to achieve byte-identical encoding WITHOUT using stored CBOR (cbor.DecodeStoreCbor). Instead, the approach uses proper field types (pointers for optional fields) and relies on the cbor package's deterministic encoding (SortCoreDeterministic) to ensure reproducible output. The stored CBOR pattern should not be suggested as a solution for round-trip fidelity in this codebase.

Applied to files:

  • protocol/messagesubmission/messages_test.go
  • protocol/common/authentication.go
  • protocol/localmessagesubmission/messages_test.go
🧬 Code graph analysis (5)
protocol/messagesubmission/messages_test.go (2)
protocol/messagesubmission/messages.go (13)
  • MsgInit (41-43)
  • MessageTypeInit (32-32)
  • MsgRequestMessageIds (46-51)
  • MessageTypeRequestMessageIds (33-33)
  • MsgReplyMessageIds (54-57)
  • MessageTypeReplyMessageIds (34-34)
  • MsgRequestMessages (60-63)
  • MessageTypeRequestMessages (35-35)
  • MsgReplyMessages (66-69)
  • MessageTypeReplyMessages (36-36)
  • MsgDone (72-74)
  • MessageTypeDone (37-37)
  • NewMsgFromCbor (77-107)
protocol/localmessagenotification/messages_test.go (1)
  • TestNewMsgFromCborUnknownType (160-164)
protocol/localmessagesubmission/messages.go (2)
protocol/message.go (1)
  • MessageBase (25-30)
protocol/localmessagenotification/messages.go (2)
  • NewMsgFromCbor (63-89)
  • ProtocolName (26-26)
protocol/localmessagenotification/localmessagenotification.go (5)
protocol/state.go (6)
  • NewState (38-43)
  • StateMapEntry (62-67)
  • AgencyClient (27-27)
  • StateTransition (51-55)
  • AgencyServer (28-28)
  • AgencyNone (26-26)
protocol/localmessagenotification/messages.go (11)
  • MessageTypeRequestMessages (32-32)
  • MsgRequestMessages (39-42)
  • MessageTypeClientDone (35-35)
  • MessageTypeReplyMessagesNonBlocking (33-33)
  • MessageTypeReplyMessagesBlocking (34-34)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • NewMsgFromCbor (63-89)
  • MsgReplyMessagesNonBlocking (45-49)
  • MsgReplyMessagesBlocking (52-55)
  • MsgClientDone (58-60)
protocol/protocol.go (4)
  • Protocol (39-60)
  • ProtocolConfig (63-77)
  • MessageFromCborFunc (119-119)
  • MessageHandlerFunc (116-116)
protocol/common/authentication.go (5)
  • MessageAuthenticator (33-53)
  • TTLValidator (385-388)
  • NewMessageAuthenticator (78-90)
  • ApplyDefaultKESVerifier (68-75)
  • NewTTLValidator (392-409)
protocol/message.go (1)
  • MessageBase (25-30)
protocol/messagesubmission/messagesubmission.go (3)
protocol/state.go (2)
  • StateMapEntry (62-67)
  • StateTransition (51-55)
protocol/protocol.go (5)
  • Protocol (39-60)
  • ProtocolOptions (99-108)
  • ProtocolConfig (63-77)
  • MessageFromCborFunc (119-119)
  • MessageHandlerFunc (116-116)
protocol/message.go (1)
  • MessageBase (25-30)
protocol/localmessagesubmission/messages_test.go (3)
protocol/message.go (1)
  • MessageBase (25-30)
protocol/localmessagenotification/messages.go (1)
  • NewMsgFromCbor (63-89)
protocol/messagesubmission/messages.go (3)
  • NewMsgFromCbor (77-107)
  • MsgDone (72-74)
  • MessageTypeDone (37-37)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (19)
protocol/localmessagesubmission/messages_test.go (1)

1-175: Well-structured test coverage for LocalMessageSubmission messages.

The tests comprehensively cover:

  • Message type verification for all four message types
  • CBOR round-trip encoding/decoding for MsgSubmitMessage
  • All RejectReason variants with correct type codes (0-3)
  • Error handling for unknown message types
protocol/common/dmq.go (2)

68-73: Appropriate handling of timestamp comparison.

The #nosec G115 annotation correctly documents that Unix timestamps won't overflow uint32 until 2106. The comparison logic is straightforward and correct.


75-118: Clean interface design for rejection reasons.

The RejectReason interface with four concrete implementations follows a clear pattern. Each type returns the correct code per CIP-0137 (Invalid=0, AlreadyReceived=1, Expired=2, Other=3).

protocol/messagesubmission/messages_test.go (2)

144-156: Test now properly validates CBOR round-trip.

The test correctly encodes MsgInit to CBOR and decodes it via NewMsgFromCbor, addressing the earlier review feedback about test naming vs. behavior.


165-259: Comprehensive CBOR round-trip coverage for all message types.

TestNewMsgFromCborRoundTrip validates encode/decode paths for all six message types (MsgInit, MsgRequestMessageIds, MsgReplyMessageIds, MsgRequestMessages, MsgReplyMessages, MsgDone). This aligns with the repository's goal of achieving byte-identical encoding without stored CBOR patterns. Based on learnings, this is the preferred approach.

protocol/localmessagesubmission/messages.go (2)

60-87: Factory function follows established protocol pattern.

NewMsgFromCbor correctly implements the standard pattern: switch on message type, instantiate appropriate struct, decode CBOR, store raw bytes via SetCbor, and return with wrapped errors. This matches the implementations in localmessagenotification and messagesubmission.


49-53: Interface field Reason may cause CBOR decoding issues.

MsgRejectMessage.Reason is typed as pcommon.RejectReason (interface). CBOR decoders typically cannot decode into interface types without custom unmarshalers since they don't know which concrete type to instantiate.

#!/bin/bash
# Check if there's custom CBOR unmarshaling for RejectReason
rg -n 'UnmarshalCBOR|DecMode' protocol/common/dmq.go protocol/localmessagesubmission/ --type=go

# Check how the reject message is actually used and decoded in tests
rg -n 'MsgRejectMessage' protocol/localmessagesubmission/ --type=go -A5 -B2
protocol/localmessagesubmission/localmessagesubmission.go (3)

154-202: State management correctly addresses previous review concerns.

The double-check pattern properly handles the TOCTOU (time-of-check to time-of-use) race:

  1. First lock: verify state is Idle
  2. Unlock: perform expensive validation (TTL, authentication)
  3. Second lock: re-verify state is still Idle before transitioning to Busy

This prevents the protocol from being left in an inconsistent state if validation fails.


309-340: Validation before state transition is correct, but error handling differs from client path.

handleSubmitMessage validates inbound messages before transitioning to Busy, which is correct. However, when validation fails (TTL or auth error), it returns an error rather than sending a RejectMessage to the client. This may be intentional (treat validation failures as protocol errors), but differs from how the handler's rejection path works.

Is returning an error on validation failure the intended behavior, or should it send a MsgRejectMessage with an appropriate reason (e.g., ExpiredReason or InvalidReason)?


364-373: State validation in handleDone is correct.

The handler properly checks that state is Idle before transitioning to Done and stopping the protocol, matching the state machine definition where Done is only valid from Idle.

protocol/common/authentication.go (3)

30-90: LGTM: Concurrency issues resolved.

The addition of mu sync.RWMutex at line 36, along with the KES verifier injection pattern, addresses the previous concurrency and verification concerns. The default verifier mechanism allows optional real KES verification without creating hard dependencies.


233-327: LGTM: KES signature verification properly implemented.

The updated implementation correctly handles KES verification with three modes: callback-based verification (lines 276-310), insecure bypass for tests/dev (lines 313-320), or explicit error (lines 323-327). The wrapped payload encoding at lines 260-266 now correctly produces a CBOR bytestring containing the payload without unnecessary array wrapping.


382-458: LGTM: TTL validation logic is sound.

The TTL validator correctly enforces message lifetime constraints with reasonable defaults. The uint32 arithmetic at line 431 is safe for realistic TTL values (default 30 minutes), and the implementation properly handles both expiration checks and future-timestamp bounds.

protocol/localmessagenotification/localmessagenotification.go (3)

366-404: LGTM: Queue consumption fixed.

The updated logic correctly removes messages from the queue after returning them. Messages are only kept in filtered if they weren't added to available (lines 393-395), and the acknowledgedIDs map prevents duplicate delivery. This resolves the issue from previous reviews.


440-495: LGTM: Correct state management pattern.

The handler properly sends the reply first (lines 472, 488), then updates state to Idle only after successful send (lines 475-477, 491-493). This ensures state consistency even if the send fails.


616-648: LGTM: Safe closure handling.

The done channel and newMessageSignalClosed flag prevent double-close panics. The logic is safe, though the closeChan variable at line 633 is always true and could be simplified.

protocol/messagesubmission/messagesubmission.go (3)

450-475: LGTM: SubmitDone now accepts StateIdle.

The updated state check at lines 453-455 correctly allows StateIdle in addition to StateMessageIdsBlocking and StateMessageIdsNonBlocking, matching the protocol state map definition. This resolves the previous review concern.


633-659: LGTM: Data race resolved.

The lock is now held continuously from line 637 through the call to collectMessagesByID at line 645 and the state update at line 648, preventing concurrent access to m.messageQueue. The send at line 658 occurs after releasing the lock, which is the correct pattern.


547-610: LGTM: Proper synchronization and bounds checking.

The handler correctly holds the lock while accessing m.pendingMessageIDs and m.messageQueue, includes helpful bounds checking with a warning log at line 563, and releases the lock before sending at line 609. Well-implemented.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

♻️ Duplicate comments (1)
protocol/common/authentication.go (1)

430-431: Potential uint32 overflow in TTL calculation.

If now is close to math.MaxUint32 and v.maxAllowedTTL.Seconds() is large, the addition could overflow, wrapping around to a small value. This would incorrectly allow messages with far-future expiration.

 	// Check if expiration is too far in the future (protocol violation)
-	maxAllowedExpiration := now + uint32(v.maxAllowedTTL.Seconds())
+	maxAllowedSeconds := uint32(v.maxAllowedTTL.Seconds())
+	// Guard against overflow
+	var maxAllowedExpiration uint32
+	if now > ^uint32(0) - maxAllowedSeconds {
+		maxAllowedExpiration = ^uint32(0) // cap at max
+	} else {
+		maxAllowedExpiration = now + maxAllowedSeconds
+	}

Alternatively, since Unix timestamps won't approach MaxUint32 until year 2106, this may be acceptable with the existing #nosec comment extended to cover this line.

🧹 Nitpick comments (3)
protocol/common/authentication_test.go (1)

28-34: Consider using public API for cache verification.

The test accesses internal field kesOpCertCache directly. While acceptable for white-box testing, this creates coupling to implementation details. If the internal structure changes, this test will break even if the public behavior remains correct.

protocol/messagesubmission/messages.go (1)

102-105: Redundant nil check.

The ret != nil check on line 102 is always true at this point—all switch cases assign a non-nil pointer, and the default case returns early. This is consistent with similar code in localmessagesubmission/messages.go, so it may be intentional defensive coding.

-	if ret != nil {
-		// Store the raw message CBOR
-		ret.SetCbor(data)
-	}
+	// Store the raw message CBOR
+	ret.SetCbor(data)
protocol/localmessagenotification/messages.go (1)

84-87: Same redundant nil check pattern.

Consistent with messagesubmission/messages.go, this nil check is always true. Consider removing for clarity, or keep for defensive consistency across all protocol message factories.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f673e27 and 2af25af.

📒 Files selected for processing (12)
  • protocol/common/authentication.go (1 hunks)
  • protocol/common/authentication_test.go (1 hunks)
  • protocol/common/dmq.go (1 hunks)
  • protocol/localmessagenotification/localmessagenotification.go (1 hunks)
  • protocol/localmessagenotification/messages.go (1 hunks)
  • protocol/localmessagenotification/messages_test.go (1 hunks)
  • protocol/localmessagesubmission/localmessagesubmission.go (1 hunks)
  • protocol/localmessagesubmission/messages.go (1 hunks)
  • protocol/localmessagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messages.go (1 hunks)
  • protocol/messagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messagesubmission.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • protocol/localmessagesubmission/messages.go
  • protocol/messagesubmission/messages_test.go
  • protocol/common/dmq.go
  • protocol/localmessagesubmission/messages_test.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-04T15:54:22.683Z
Learnt from: wolf31o2
Repo: blinklabs-io/gouroboros PR: 1093
File: ledger/mary/pparams.go:143-150
Timestamp: 2025-11-04T15:54:22.683Z
Learning: In the blinklabs-io/gouroboros repository, the design goal for CBOR round-trip tests is to achieve byte-identical encoding WITHOUT using stored CBOR (cbor.DecodeStoreCbor). Instead, the approach uses proper field types (pointers for optional fields) and relies on the cbor package's deterministic encoding (SortCoreDeterministic) to ensure reproducible output. The stored CBOR pattern should not be suggested as a solution for round-trip fidelity in this codebase.

Applied to files:

  • protocol/common/authentication.go
🧬 Code graph analysis (5)
protocol/localmessagesubmission/localmessagesubmission.go (3)
protocol/state.go (6)
  • NewState (38-43)
  • StateMapEntry (62-67)
  • AgencyClient (27-27)
  • StateTransition (51-55)
  • AgencyServer (28-28)
  • AgencyNone (26-26)
protocol/localmessagesubmission/messages.go (11)
  • MessageTypeSubmitMessage (32-32)
  • MessageTypeDone (35-35)
  • MessageTypeAcceptMessage (33-33)
  • MessageTypeRejectMessage (34-34)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • NewMsgFromCbor (61-87)
  • MsgSubmitMessage (39-42)
  • MsgAcceptMessage (45-47)
  • MsgRejectMessage (50-53)
  • MsgDone (56-58)
protocol/messagesubmission/messages.go (5)
  • MessageTypeDone (37-37)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • NewMsgFromCbor (77-107)
  • MsgDone (72-74)
protocol/messagesubmission/messages.go (3)
protocol/message.go (1)
  • MessageBase (25-30)
protocol/common/dmq.go (2)
  • MessageIDAndSize (66-72)
  • Message (53-63)
cbor/decode.go (1)
  • Decode (29-44)
protocol/common/authentication.go (2)
protocol/common/dmq.go (2)
  • Message (53-63)
  • OperationalCertificate (40-50)
cbor/encode.go (1)
  • Encode (28-41)
protocol/localmessagenotification/messages.go (5)
protocol/message.go (1)
  • MessageBase (25-30)
protocol/common/dmq.go (1)
  • Message (53-63)
protocol/localmessagesubmission/messages.go (2)
  • NewMsgFromCbor (61-87)
  • ProtocolName (26-26)
protocol/messagesubmission/messages.go (4)
  • NewMsgFromCbor (77-107)
  • MessageTypeRequestMessages (35-35)
  • MsgRequestMessages (60-63)
  • ProtocolName (26-26)
cbor/decode.go (1)
  • Decode (29-44)
protocol/localmessagenotification/localmessagenotification.go (5)
protocol/state.go (6)
  • NewState (38-43)
  • StateMapEntry (62-67)
  • AgencyClient (27-27)
  • StateTransition (51-55)
  • AgencyServer (28-28)
  • AgencyNone (26-26)
protocol/localmessagenotification/messages.go (9)
  • MessageTypeRequestMessages (32-32)
  • MsgRequestMessages (39-42)
  • MessageTypeClientDone (35-35)
  • MessageTypeReplyMessagesNonBlocking (33-33)
  • MessageTypeReplyMessagesBlocking (34-34)
  • NewMsgFromCbor (63-89)
  • MsgReplyMessagesNonBlocking (45-49)
  • MsgReplyMessagesBlocking (52-55)
  • MsgClientDone (58-60)
protocol/protocol.go (4)
  • Protocol (39-60)
  • ProtocolConfig (63-77)
  • MessageFromCborFunc (119-119)
  • MessageHandlerFunc (116-116)
protocol/common/authentication.go (5)
  • MessageAuthenticator (33-53)
  • TTLValidator (385-388)
  • NewMessageAuthenticator (78-90)
  • ApplyDefaultKESVerifier (68-75)
  • NewTTLValidator (392-409)
protocol/message.go (1)
  • MessageBase (25-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (9)
protocol/localmessagenotification/messages_test.go (1)

26-163: LGTM! Comprehensive test coverage for message types.

The tests properly verify message type constants, field values, and error handling for unknown types. The test data uses correct field sizes (448-byte KES signatures, 32-byte keys, 64-byte cold signatures) consistent with the DMQ spec.

Consider adding CBOR round-trip tests (encode → decode → verify fields) to ensure serialization correctness, though the current coverage is sufficient for the initial implementation.

protocol/common/authentication_test.go (2)

354-395: Well-structured test for KES signature size validation.

This test correctly sets up a valid operational certificate with a properly signed cold signature before testing the KES signature size check. This ensures the test fails at the intended validation point rather than an earlier check.


1-396: Comprehensive authentication test coverage.

The tests thoroughly cover the authentication subsystem including:

  • SPO pool registration lifecycle
  • Nil/invalid input handling
  • Certificate and signature size validation
  • KES period rotation anti-replay protection
  • TTL validation edge cases

Good use of time-based assertions with tolerance ranges (lines 233-234).

protocol/messagesubmission/messages.go (1)

1-137: Clean implementation of CIP-0137 Message Submission protocol.

The message types follow a consistent pattern with the other DMQ protocols. Protocol ID 17 and message type constants align with CIP-0137 specification.

protocol/localmessagenotification/messages.go (1)

1-109: Solid implementation of Local Message Notification protocol.

The message types correctly model the CIP-0137 specification with appropriate distinction between blocking and non-blocking reply variants. The HasMore flag in MsgReplyMessagesNonBlocking enables pagination.

protocol/common/authentication.go (4)

312-326: Explicit handling for missing KES verifier is good.

The implementation now properly distinguishes between:

  1. Verifier injected → use it
  2. AllowInsecureKES=true → warn and bypass (for tests)
  3. Neither → fail with clear error

This addresses the previous security concern about silent acceptance of unverified signatures.


102-122: Proper mutex usage for map access.

The SPO pool registration methods correctly use Lock() for writes and RLock() for reads, addressing the previous concurrency concern.


247-266: Wrapped payload encoding now matches spec.

The payload is first CBOR-encoded, then wrapped as a CBOR byte string (bstr .cbor messagePayload). This correctly produces the nested encoding required by CIP-0137.


1-458: Well-structured authentication subsystem.

The implementation covers the key CIP-0137 authentication requirements:

  • Operational certificate verification with ed25519
  • KES signature verification (pluggable)
  • SPO pool registration tracking
  • Anti-replay via opcert number tracking
  • TTL validation

The dependency injection pattern for KES verification allows flexibility without hard-coding ledger dependencies.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (8)
protocol/localmessagesubmission/messages_test.go (1)

54-84: Consider adding field-level assertions for round-trip verification.

The CBOR round-trip test only validates that Type() matches after decode. To ensure byte-identical encoding per the repository's design goals, consider also asserting that decoded message fields (e.g., MessageID, MessageBody) match the original values.

 	parsed, err := NewMsgFromCbor(uint(MessageTypeSubmitMessage), data)
 	assert.NoError(t, err)
 	assert.Equal(t, uint8(MessageTypeSubmitMessage), parsed.Type())
+
+	// Verify decoded fields match original
+	submitMsg, ok := parsed.(*MsgSubmitMessage)
+	assert.True(t, ok)
+	assert.Equal(t, []byte("test-id"), submitMsg.Message.Payload.MessageID)
+	assert.Equal(t, []byte("test-body"), submitMsg.Message.Payload.MessageBody)
protocol/common/authentication_test.go (1)

1-396: Consider adding tests for additional code paths.

The test suite is thorough but could benefit from additional coverage:

  • VerifyMessageWithSlot with explicit slot parameter
  • SetKESVerifier with a mock verifier callback
  • AllowInsecureKES = true bypass path behavior

These are optional additions that would improve confidence in edge cases.

protocol/localmessagenotification/localmessagenotification.go (1)

461-469: WaitForMessage doesn't respect shutdown signal.

If Close() is called while WaitForMessage is blocking, the caller must wait for the full timeout duration. Add l.done to the select to return immediately on shutdown.

 func (l *LocalMessageNotification) WaitForMessage(timeout time.Duration) error {
 	select {
 	case <-l.newMessageSignal:
 		return nil
+	case <-l.done:
+		return errors.New("protocol closed")
 	case <-time.After(timeout):
 		return errors.New("timeout waiting for message")
 	}
 }
protocol/localmessagesubmission/localmessagesubmission.go (1)

154-218: Triple state validation creates complexity without preventing all races.

The code checks state three times (lines 157, 183, 206), releasing and re-acquiring the lock between each check. While this detects concurrent state changes, it doesn't prevent them. If the state changes between lines 191 and 200 (during Send), the local state will still be inconsistent with the peer.

If concurrent Close() or protocol operations are expected during SubmitMessage, consider using an atomic "operation in progress" flag or documenting this race as acceptable (since it returns an error).

protocol/messagesubmission/messages_test.go (1)

163-268: Extend round-trip verification to all message types.

Only MsgRequestMessageIds (lines 189-196) verifies field preservation and byte-identical re-encoding. Apply the same verification pattern to MsgReplyMessageIds, MsgRequestMessages, MsgReplyMessages, and MsgDone to ensure CBOR determinism across all types. Based on learnings

Example for MsgReplyMessageIds:

// After line 215, add field verification:
if decoded, ok := parsed.(*MsgReplyMessageIds); ok {
	assert.Equal(t, len(reply.Messages), len(decoded.Messages))
	assert.Equal(t, reply.Messages[0].MessageID, decoded.Messages[0].MessageID)
	assert.Equal(t, reply.Messages[0].SizeInBytes, decoded.Messages[0].SizeInBytes)
	// Re-encode and verify byte-identical
	reencoded, err := cbor.Encode(decoded)
	assert.NoError(t, err)
	assert.Equal(t, data, reencoded)
}
protocol/messagesubmission/messagesubmission.go (3)

810-828: Document lock requirement for collectAvailableIDs.

This method accesses m.messageQueue and assumes the caller holds m.lock. Add a comment to document this requirement and prevent future misuse.

 // collectAvailableIDs returns up to count message IDs currently queued
+// Caller must hold m.lock.
 func (m *MessageSubmission) collectAvailableIDs(count int) [][]byte {

830-839: Document lock requirement for lookupMessageSize.

This helper also accesses m.messageQueue without lock acquisition. Add a comment documenting that the caller must hold m.lock.

 // lookupMessageSize finds the message size for an ID if present
+// Caller must hold m.lock.
 func (m *MessageSubmission) lookupMessageSize(id []byte) uint32 {

841-863: Document lock requirement for collectMessagesByID.

Like the other collection helpers, this method accesses m.messageQueue and assumes the caller holds m.lock. Add documentation for consistency.

 // collectMessagesByID returns messages matching provided IDs (prunes missing)
+// Caller must hold m.lock.
 func (m *MessageSubmission) collectMessagesByID(
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2af25af and 6c7fd80.

📒 Files selected for processing (12)
  • protocol/common/authentication.go (1 hunks)
  • protocol/common/authentication_test.go (1 hunks)
  • protocol/common/dmq.go (1 hunks)
  • protocol/localmessagenotification/localmessagenotification.go (1 hunks)
  • protocol/localmessagenotification/messages.go (1 hunks)
  • protocol/localmessagenotification/messages_test.go (1 hunks)
  • protocol/localmessagesubmission/localmessagesubmission.go (1 hunks)
  • protocol/localmessagesubmission/messages.go (1 hunks)
  • protocol/localmessagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messages.go (1 hunks)
  • protocol/messagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messagesubmission.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • protocol/localmessagenotification/messages_test.go
  • protocol/localmessagesubmission/messages.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-04T15:54:22.683Z
Learnt from: wolf31o2
Repo: blinklabs-io/gouroboros PR: 1093
File: ledger/mary/pparams.go:143-150
Timestamp: 2025-11-04T15:54:22.683Z
Learning: In the blinklabs-io/gouroboros repository, the design goal for CBOR round-trip tests is to achieve byte-identical encoding WITHOUT using stored CBOR (cbor.DecodeStoreCbor). Instead, the approach uses proper field types (pointers for optional fields) and relies on the cbor package's deterministic encoding (SortCoreDeterministic) to ensure reproducible output. The stored CBOR pattern should not be suggested as a solution for round-trip fidelity in this codebase.

Applied to files:

  • protocol/common/dmq.go
  • protocol/localmessagesubmission/messages_test.go
  • protocol/common/authentication.go
  • protocol/messagesubmission/messages.go
  • protocol/messagesubmission/messages_test.go
🧬 Code graph analysis (4)
protocol/common/dmq.go (1)
cbor/cbor.go (1)
  • StructAsArray (45-48)
protocol/localmessagenotification/messages.go (3)
protocol/message.go (1)
  • MessageBase (25-30)
protocol/localmessagesubmission/messages.go (2)
  • NewMsgFromCbor (61-87)
  • ProtocolName (26-26)
cbor/decode.go (1)
  • Decode (29-44)
protocol/common/authentication_test.go (3)
protocol/common/authentication.go (2)
  • NewMessageAuthenticator (84-96)
  • NewTTLValidator (398-415)
protocol/common/dmq.go (3)
  • Message (53-63)
  • MessagePayload (27-37)
  • OperationalCertificate (40-50)
cbor/encode.go (1)
  • Encode (28-41)
protocol/messagesubmission/messages_test.go (4)
protocol/messagesubmission/messages.go (13)
  • MsgInit (41-43)
  • MessageTypeInit (32-32)
  • MsgRequestMessageIds (46-51)
  • MessageTypeRequestMessageIds (33-33)
  • MsgReplyMessageIds (54-57)
  • MessageTypeReplyMessageIds (34-34)
  • MsgRequestMessages (60-63)
  • MessageTypeRequestMessages (35-35)
  • MsgReplyMessages (66-69)
  • MessageTypeReplyMessages (36-36)
  • MsgDone (72-74)
  • MessageTypeDone (37-37)
  • NewMsgFromCbor (77-105)
protocol/message.go (1)
  • MessageBase (25-30)
cbor/encode.go (1)
  • Encode (28-41)
protocol/localmessagenotification/messages_test.go (1)
  • TestNewMsgFromCborUnknownType (160-164)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (24)
protocol/localmessagesubmission/messages_test.go (3)

27-52: LGTM! Well-structured test for MsgSubmitMessage.

The test properly validates the message type and payload field, and uses appropriate field sizes matching the CIP-0137 specification (448-byte KES signature, 32-byte keys, 64-byte cold signature).


97-157: Good coverage of all reject reason types.

The tests correctly validate all four RejectReason implementations with their expected type codes (0-3), ensuring the interface contract is properly implemented.


170-175: LGTM! Proper error path coverage for unknown message types.

The test correctly validates that NewMsgFromCbor returns an appropriate error for unknown message types.

protocol/common/dmq.go (3)

27-72: CBOR array encoding properly implemented with cbor.StructAsArray embedding.

The DMQ structures now correctly embed cbor.StructAsArray to ensure CBOR array encoding per CIP-0137 specification. This addresses the previous review feedback.


74-79: LGTM! Simple and correct expiration check.

The IsValid() method correctly implements the basic expiration check with appropriate documentation of the uint32 timestamp limitation.


81-124: Well-designed reject reason interface with appropriate implementations.

The RejectReason interface and its four implementations follow Go idioms cleanly. The type codes (0-3) are correctly assigned, and the InvalidReason and OtherReason types appropriately include a Message field for additional context.

protocol/localmessagenotification/messages.go (4)

25-36: LGTM! Protocol constants correctly defined for CIP-0137 Local Message Notification.

Protocol ID 19 and message type constants (0-3) align with the CIP-0137 specification.


38-60: Message structures appropriately defined for blocking/non-blocking semantics.

The MsgReplyMessagesNonBlocking correctly includes HasMore for pagination, while MsgReplyMessagesBlocking omits it as blocking mode waits for messages. Clean design.


62-87: LGTM! Factory function follows established pattern with proper error handling.

The NewMsgFromCbor implementation correctly mirrors the pattern from localmessagesubmission/messages.go, with appropriate error messages including the protocol name for debugging context.


89-107: LGTM! Type() methods correctly implement the protocol.Message interface.

All four Type() methods consistently return their respective message type constants.

protocol/common/authentication.go (6)

34-54: Thread-safety properly addressed with mutex and atomic.Value.

The MessageAuthenticator now correctly uses sync.RWMutex for map access and the defaultKESVerifier uses atomic.Value for thread-safe access, addressing previous review feedback.


56-81: Clean dependency injection pattern for KES verification.

The verifier injection pattern via SetDefaultKESVerifier and ApplyDefaultKESVerifier properly decouples the protocol package from the ledger package while allowing production code to inject real verification.


150-193: Verification flow is well-structured with proper mutex usage.

The 5-step verification process follows a logical order (authentication before authorization), and the read lock on spoPoolIDs check is correctly scoped.


195-237: Operational certificate verification correctly implements ed25519 signature check.

The certificate data is properly CBOR-encoded as an array before verification, and the size constraints (32-byte cold key, 64-byte signature) are enforced.


239-333: KES signature verification now properly handles the security/testing tradeoff.

The implementation correctly:

  1. Wraps the payload in a CBOR bytestring (addressing previous review feedback)
  2. Uses the injected verifier when available
  3. Falls back to explicit insecure bypass with warning log
  4. Hard-fails in production when no verifier is configured and AllowInsecureKES=false

363-386: KES period rotation check correctly prevents replay attacks.

The verifyKESPeriodRotation method properly tracks and enforces that opcert issue numbers never decrease for a given pool, preventing credential replay attacks.

protocol/common/authentication_test.go (4)

28-46: LGTM! Good coverage of authenticator creation and pool registration lifecycle.

Tests correctly validate the public API for SPO pool management including registration, presence check, and unregistration.


102-133: Excellent coverage of KES period rotation anti-replay protection.

The test correctly validates all four scenarios: first registration, equal issue number (allowed), higher number (allowed), and lower number (rejected with appropriate error message).


135-261: Comprehensive TTL validation test coverage.

Tests properly cover all TTL scenarios including defaults, custom values, expired messages, valid messages, messages too far in the future, nil handling, and time-until-expiration calculations.


354-396: Well-designed test isolates KES signature size validation.

The test correctly generates a valid operational certificate (with real ed25519 signature) to ensure the verification fails specifically on the KES signature size check, not on earlier steps.

protocol/localmessagenotification/localmessagenotification.go (1)

628-668: Message collection logic correctly implements blocking vs non-blocking modes.

The implementation properly handles the CIP-0137 specification:

  • Blocking mode consumes only the first available message
  • Non-blocking mode consumes all available messages
  • Expired messages are correctly filtered out
protocol/localmessagesubmission/localmessagesubmission.go (1)

351-382: Handler invocation and state management look correct.

The message handler is called without holding the lock (line 376), which is appropriate since handlers may perform I/O or other blocking operations. The subsequent Accept/Reject submissions properly manage state transitions.

protocol/messagesubmission/messages.go (1)

76-105: Message factory correctly handles all protocol message types.

The NewMsgFromCbor factory properly:

  • Instantiates the correct message type for each CIP-0137 message ID
  • Decodes CBOR payload with error handling
  • Stores raw CBOR for potential re-serialization
  • Returns clear errors for unknown message types
protocol/messagesubmission/messagesubmission.go (1)

656-719: Verify AckCount handling aligns with CIP-0137.

When AckCount > len(pendingMessageIDs) (line 668), the code logs a warning but leaves pendingMessageIDs unchanged (line 673). This may hide protocol violations or peer bugs.

Consider whether the CIP-0137 specification requires:

  1. Treating oversized AckCount as an error (stricter)
  2. Clearing the entire buffer (treat as acknowledgment of all)
  3. Ignoring excess acks with a warning (current behavior)

The current implementation prioritizes robustness over strictness. If this is intentional, the comment should explain the rationale.

@wolf31o2 wolf31o2 force-pushed the feat/dmq-cip-0137 branch 2 times, most recently from e24b502 to 3879be9 Compare December 17, 2025 13:24
@wolf31o2
Copy link
Member Author

@cubic-dev-ai review this PR

@cubic-dev-ai
Copy link
Contributor

cubic-dev-ai bot commented Dec 17, 2025

@cubic-dev-ai review this PR

@wolf31o2 I have started the AI code review. It will take a few minutes to complete.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
protocol/localmessagesubmission/messages_test.go (1)

54-84: Enhance CBOR round-trip test to verify field preservation and byte-identical encoding.

The round-trip test only checks for errors and Type() but doesn't verify that all fields (Message.Payload.MessageID, Message.Payload.MessageBody, KESSignature, etc.) are preserved after decoding, nor does it confirm byte-identical re-encoding. Based on learnings, this codebase aims for deterministic CBOR encoding.

Add assertions after decoding to verify field equality and byte-identical re-encoding:

 	parsed, err := NewMsgFromCbor(uint(MessageTypeSubmitMessage), data)
 	assert.NoError(t, err)
 	assert.Equal(t, uint8(MessageTypeSubmitMessage), parsed.Type())
+	
+	parsedMsg := parsed.(*MsgSubmitMessage)
+	assert.Equal(t, msg.Message.Payload.MessageID, parsedMsg.Message.Payload.MessageID)
+	assert.Equal(t, msg.Message.Payload.MessageBody, parsedMsg.Message.Payload.MessageBody)
+	assert.Equal(t, msg.Message.Payload.KESPeriod, parsedMsg.Message.Payload.KESPeriod)
+	assert.Equal(t, msg.Message.Payload.ExpiresAt, parsedMsg.Message.Payload.ExpiresAt)
+	
+	// Verify byte-identical re-encoding
+	data2, err := cbor.Encode(parsedMsg)
+	assert.NoError(t, err)
+	assert.Equal(t, data, data2, "re-encoded bytes should match original")
protocol/localmessagesubmission/messages.go (1)

82-85: Redundant nil check in NewMsgFromCbor.

The if ret != nil check is redundant because ret is guaranteed to be non-nil after the switch statement. The switch either assigns a concrete message struct to ret or returns an error (lines 73-77), so the nil check will always be true.

 	if _, err := cbor.Decode(data, ret); err != nil {
 		return nil, fmt.Errorf("%s: decode error: %w", ProtocolName, err)
 	}
-	if ret != nil {
-		// Store the raw message CBOR
-		ret.SetCbor(data)
-	}
+	// Store the raw message CBOR
+	ret.SetCbor(data)
 	return ret, nil
protocol/messagesubmission/messagesubmission.go (2)

810-863: Consider index optimization for large queues.

lookupMessageSize (lines 831-839) has O(n) complexity and is called for each ID in handleRequestMessageIds (line 706), resulting in O(n×m) overall complexity.

collectMessagesByID (lines 842-863) also has O(n×m) complexity where n is the number of requested IDs and m is the queue size.

For the expected queue sizes (default 100), this is acceptable. If queue sizes grow significantly or message lookups become a bottleneck, consider maintaining a map[string]*pcommon.Message index for O(1) lookups.


154-170: Remove unused acknowledgedMessageIDs field.

The acknowledgedMessageIDs map is initialized but never read or written elsewhere in the codebase. Removing this unused field will reduce memory overhead.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c7fd80 and e24b502.

📒 Files selected for processing (19)
  • ledger/allegra/allegra.go (2 hunks)
  • ledger/alonzo/alonzo.go (1 hunks)
  • ledger/babbage/babbage.go (1 hunks)
  • ledger/common/metadata.go (2 hunks)
  • ledger/conway/conway.go (1 hunks)
  • ledger/mary/mary.go (2 hunks)
  • ledger/shelley/shelley.go (2 hunks)
  • protocol/common/authentication.go (1 hunks)
  • protocol/common/authentication_test.go (1 hunks)
  • protocol/common/dmq.go (1 hunks)
  • protocol/localmessagenotification/localmessagenotification.go (1 hunks)
  • protocol/localmessagenotification/messages.go (1 hunks)
  • protocol/localmessagenotification/messages_test.go (1 hunks)
  • protocol/localmessagesubmission/localmessagesubmission.go (1 hunks)
  • protocol/localmessagesubmission/messages.go (1 hunks)
  • protocol/localmessagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messages.go (1 hunks)
  • protocol/messagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messagesubmission.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • protocol/messagesubmission/messages_test.go
  • protocol/localmessagenotification/messages.go
  • protocol/common/dmq.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-04T15:54:22.683Z
Learnt from: wolf31o2
Repo: blinklabs-io/gouroboros PR: 1093
File: ledger/mary/pparams.go:143-150
Timestamp: 2025-11-04T15:54:22.683Z
Learning: In the blinklabs-io/gouroboros repository, the design goal for CBOR round-trip tests is to achieve byte-identical encoding WITHOUT using stored CBOR (cbor.DecodeStoreCbor). Instead, the approach uses proper field types (pointers for optional fields) and relies on the cbor package's deterministic encoding (SortCoreDeterministic) to ensure reproducible output. The stored CBOR pattern should not be suggested as a solution for round-trip fidelity in this codebase.

Applied to files:

  • ledger/allegra/allegra.go
  • ledger/alonzo/alonzo.go
  • protocol/localmessagesubmission/messages_test.go
  • ledger/babbage/babbage.go
  • protocol/messagesubmission/messages.go
  • ledger/shelley/shelley.go
  • protocol/common/authentication.go
  • ledger/conway/conway.go
  • ledger/mary/mary.go
🧬 Code graph analysis (7)
protocol/localmessagenotification/messages_test.go (2)
protocol/message.go (1)
  • MessageBase (25-30)
protocol/localmessagesubmission/messages.go (1)
  • NewMsgFromCbor (61-87)
ledger/common/metadata.go (1)
cbor/value.go (1)
  • Value (29-33)
protocol/common/authentication_test.go (3)
protocol/common/authentication.go (4)
  • NewMessageAuthenticator (85-97)
  • NewTTLValidator (399-416)
  • SetDefaultKESVerifier (64-68)
  • ApplyDefaultKESVerifier (72-82)
protocol/common/dmq.go (3)
  • Message (53-63)
  • MessagePayload (27-37)
  • OperationalCertificate (40-50)
cbor/encode.go (1)
  • Encode (28-41)
protocol/messagesubmission/messages.go (3)
protocol/message.go (1)
  • MessageBase (25-30)
protocol/common/dmq.go (2)
  • MessageIDAndSize (66-72)
  • Message (53-63)
cbor/decode.go (1)
  • Decode (29-44)
protocol/common/authentication.go (3)
cbor/value.go (1)
  • Value (29-33)
protocol/common/dmq.go (2)
  • Message (53-63)
  • OperationalCertificate (40-50)
cbor/encode.go (1)
  • Encode (28-41)
protocol/localmessagesubmission/localmessagesubmission.go (3)
protocol/state.go (3)
  • NewState (38-43)
  • StateMapEntry (62-67)
  • StateTransition (51-55)
protocol/localmessagesubmission/messages.go (7)
  • MessageTypeSubmitMessage (32-32)
  • MessageTypeDone (35-35)
  • MessageTypeAcceptMessage (33-33)
  • MessageTypeRejectMessage (34-34)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • NewMsgFromCbor (61-87)
protocol/messagesubmission/messages.go (4)
  • MessageTypeDone (37-37)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • NewMsgFromCbor (77-105)
protocol/messagesubmission/messagesubmission.go (2)
protocol/messagesubmission/messages.go (15)
  • MessageTypeInit (32-32)
  • MessageTypeRequestMessageIds (33-33)
  • MsgRequestMessageIds (46-51)
  • MessageTypeRequestMessages (35-35)
  • MessageTypeDone (37-37)
  • MessageTypeReplyMessageIds (34-34)
  • MessageTypeReplyMessages (36-36)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • NewMsgFromCbor (77-105)
  • MsgInit (41-43)
  • MsgReplyMessageIds (54-57)
  • MsgRequestMessages (60-63)
  • MsgReplyMessages (66-69)
  • MsgDone (72-74)
protocol/localmessagesubmission/messages.go (5)
  • MessageTypeDone (35-35)
  • ProtocolName (26-26)
  • ProtocolID (27-27)
  • NewMsgFromCbor (61-87)
  • MsgDone (56-58)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (30)
ledger/alonzo/alonzo.go (1)

870-873: LGTM—defensive nil guard improves robustness.

The nil receiver check prevents potential panics when Cbor() is called on a nil *AlonzoTransaction. This aligns with the broader defensive pattern applied consistently across ledger transaction types in this PR.

ledger/mary/mary.go (2)

320-321: LGTM—readability improvement.

Line wrapping the multi-condition check improves readability without altering the logic.


490-492: LGTM—defensive nil guard improves robustness.

The nil receiver check prevents potential panics when Cbor() is called on a nil *MaryTransaction, consistent with the defensive pattern applied across other ledger transaction types.

ledger/babbage/babbage.go (1)

1109-1111: LGTM—defensive nil guard improves robustness.

The nil receiver check prevents potential panics when Cbor() is called on a nil *BabbageTransaction, maintaining consistency with the defensive pattern across ledger transaction types.

ledger/conway/conway.go (1)

833-835: LGTM—defensive nil guard improves robustness.

The nil receiver check prevents potential panics when Cbor() is called on a nil *ConwayTransaction, completing the consistent defensive pattern across all ledger eras.

ledger/allegra/allegra.go (2)

287-288: LGTM—readability improvement.

Line wrapping the multi-condition check improves readability without altering the logic.


465-467: LGTM—defensive nil guard improves robustness.

The nil receiver check prevents potential panics when Cbor() is called on a nil *AllegraTransaction, consistent with the defensive pattern applied across ledger transaction types.

ledger/shelley/shelley.go (2)

601-602: LGTM—readability improvement.

Line wrapping the multi-condition check improves readability without altering the logic.


776-778: LGTM—defensive nil guard improves robustness.

The nil receiver check prevents potential panics when Cbor() is called on a nil *ShelleyTransaction, establishing a consistent defensive pattern across all ledger eras from Shelley through Conway.

ledger/common/metadata.go (1)

21-21: LGTM! Good refactoring for clarity.

Using slices.Clone instead of the manual append([]byte(nil), bs...) pattern improves code readability while maintaining identical semantics. Both approaches create a distinct copy of the byte slice, but the standard library function makes the intent explicit.

Also applies to: 255-255

protocol/localmessagenotification/messages_test.go (1)

25-164: LGTM! Test coverage is appropriate.

The tests correctly verify message types, field values, and error handling for unknown message types. The test structure aligns well with the message definitions in the corresponding messages.go file.

protocol/common/authentication_test.go (1)

29-534: LGTM! Comprehensive authentication test coverage.

The test suite thoroughly exercises the MessageAuthenticator and TTLValidator APIs including:

  • Pool registration and verification
  • KES signature verification with custom verifiers
  • TTL validation edge cases (expired, too far in future)
  • Operational certificate verification
  • Default verifier injection patterns

The buildTestMessage helper is well-designed for reuse across tests.

protocol/messagesubmission/messages.go (1)

76-105: LGTM! Clean message factory implementation.

The NewMsgFromCbor factory correctly handles all message types with proper error reporting for unknown types. The CBOR decoding and raw message storage pattern is consistent with the codebase conventions.

protocol/common/authentication.go (3)

32-97: LGTM! Excellent authentication infrastructure with proper concurrency handling.

The MessageAuthenticator implementation demonstrates solid design:

  • Thread-safe map access via sync.RWMutex
  • Package-level default verifier using atomic.Value for race-free concurrent access
  • Clear separation between secure (with verifier) and test/dev modes (AllowInsecureKES)
  • Proper initialization with sensible defaults

240-334: LGTM! Well-designed pluggable KES verification.

The KES signature verification correctly:

  • Validates signature and key sizes
  • Encodes payload as CBOR byte string per spec (bstr .cbor messagePayload)
  • Supports pluggable verifier injection to avoid hard dependencies
  • Provides secure default behavior (error when no verifier) with explicit opt-in for test bypass
  • Logs warnings when insecure bypass is enabled

This design allows production deployments to inject real KES verification while enabling tests to run without it.


418-457: LGTM! Overflow protection in TTL validation.

The ValidateMessageTTL correctly prevents uint32 overflow by:

  • Computing maxAllowedExpiration in uint64 (lines 439-440)
  • Clamping to math.MaxUint32 (lines 441-443)
  • Safely casting back to uint32 after clamping (line 446)

This ensures that even misconfigured large TTL values won't cause arithmetic overflow or incorrect validation.

protocol/localmessagenotification/localmessagenotification.go (4)

194-231: LGTM! AddMessage correctly avoids deadlock.

The implementation properly reads newMessageSignalClosed directly at line 223 while holding the lock acquired at line 196 (with deferred unlock at line 197). This avoids the deadlock that would occur from attempting to re-lock the mutex and correctly signals new messages while preventing races with Close().


233-309: LGTM! State transitions follow correct pattern.

Both SubmitRequestMessagesNonBlocking and SubmitRequestMessagesBlocking correctly:

  1. Check state and release lock before sending
  2. Call Send() and return early on error (preserving the original state)
  3. Re-check state and only transition if still StateIdle

This pattern ensures the local state remains consistent with the peer state even when network errors occur during Send().


421-458: LGTM! Messages are correctly consumed and acknowledgedIDs is bounded.

The GetAvailableMessages implementation correctly:

  • Consumes messages by adding them to available OR keeping them in filtered, not both (lines 443-448)
  • Clears the acknowledgedIDs map at line 453 to prevent unbounded memory growth
  • Properly handles expired messages by skipping them without adding to filtered

This addresses past review concerns about message duplication and memory leaks.


671-703: LGTM! Close properly guards against races and double-close panics.

The Close() implementation uses:

  • select with l.done to prevent double-close (lines 677-682)
  • newMessageSignalClosed flag set while holding lock (line 686) to coordinate with message enqueueing
  • Channel close after setting the flag to prevent sends-on-closed-channel panics

This is a robust pattern for graceful shutdown.

protocol/localmessagesubmission/localmessagesubmission.go (2)

153-218: LGTM! SubmitMessage correctly handles validation and state transitions.

The implementation properly:

  1. Performs TTL and authentication validation outside the lock (lines 167-179) to avoid holding the lock during potentially expensive operations
  2. Re-checks state before sending (lines 182-191)
  3. Sends the message and only transitions to StateBusy after successful send (lines 200-217)

This pattern ensures state consistency even when validation or send operations fail.


220-316: LGTM! State transitions occur after successful sends.

All submission methods (SubmitAcceptMessage, SubmitRejectMessage, SubmitDone) correctly:

  • Check current state
  • Call Send() and return early on error
  • Only transition state after successful send

This pattern prevents state desynchronization between client and server when network errors occur, addressing concerns from past reviews.

protocol/messagesubmission/messagesubmission.go (8)

28-152: LGTM! State machine is well-defined.

The protocol state definitions, timeouts, and transition map correctly implement CIP-0137. The use of MatchFunc to distinguish blocking vs non-blocking requests based on the IsBlocking field is appropriate.


181-233: LGTM! Constructor properly initializes the protocol.

Default values are sensible, and the protocol is correctly started after initialization.


235-263: LGTM! Message validation is comprehensive.

The validation sequence (nil check → TTL → authentication → capacity) is appropriate, and the fallback to msg.IsValid() when TTLValidator is nil is a good defensive measure.


265-579: LGTM! All Submit methods follow consistent send-before-state pattern.

All public submission methods correctly:

  1. Validate state before sending
  2. Send the message
  3. Re-check state and transition only after successful send

This addresses the previous review concern about state desynchronization on send failures.


581-614: LGTM! Message delivery correctly removes items from queue.

The filtering logic properly removes delivered messages while retaining non-expired, undelivered ones. The fix from the previous review (lines 602-608) correctly implements this behavior.


770-808: LGTM! Message validation and termination handlers are correct.

handleReplyMessages properly validates TTL and authentication before enqueuing, with appropriate error handling. handleDone correctly transitions to terminal state.


865-874: LGTM! Close method properly cleans up resources.

State transition, protocol stop, and queue cleanup are correctly sequenced under the lock.


667-675: The current implementation defensively handles oversized AckCount by logging a warning without modifying the buffer. CIP-0137 does not specify behavior for this edge case, so the conservative approach is reasonable. Consider whether this scenario represents a legitimate protocol edge case requiring buffer clearing, or if the current behavior is appropriate for detecting peer anomalies. Verify this design choice against actual deployment experience with remote peers.

Comment on lines 742 to 775
// handleRequestMessages processes message requests
func (m *MessageSubmission) handleRequestMessages(
req *MsgRequestMessages,
) error {
m.lock.Lock()
if m.state != StateIdle {
m.lock.Unlock()
return fmt.Errorf("invalid state for request messages: %v", m.state)
}
m.state = StateMessages

// Collect messages while holding lock to prevent data race
messages := m.collectMessagesByID(req.MessageIDs)

// Transition back to idle while still holding lock to avoid races
m.state = StateIdle
m.lock.Unlock()

reply := &MsgReplyMessages{
MessageBase: protocol.MessageBase{
MessageType: MessageTypeReplyMessages,
},
Messages: messages,
}

return m.sendFn(reply)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

State transition before send creates desynchronization risk.

Lines 751 and 757 transition state to StateMessages and back to StateIdle before the reply is sent at line 767. If sendFn fails, the local state will be StateIdle but the peer never received the messages.

For consistency with the Submit* methods and to prevent state desynchronization, transition to StateMessages, collect messages, send the reply, and only transition to StateIdle after successful send.

Apply this pattern:

 	m.lock.Lock()
 	if m.state != StateIdle {
 		m.lock.Unlock()
 		return fmt.Errorf("invalid state for request messages: %v", m.state)
 	}
 	m.state = StateMessages
 
 	// Collect messages while holding lock to prevent data race
 	messages := m.collectMessagesByID(req.MessageIDs)
-
-	// Transition back to idle while still holding lock to avoid races
-	m.state = StateIdle
 	m.lock.Unlock()
 
 	reply := &MsgReplyMessages{
 		MessageBase: protocol.MessageBase{
 			MessageType: MessageTypeReplyMessages,
 		},
 		Messages: messages,
 	}
 
-	return m.sendFn(reply)
+	if err := m.sendFn(reply); err != nil {
+		return err
+	}
+	
+	m.lock.Lock()
+	m.state = StateIdle
+	m.lock.Unlock()
+	return nil
🤖 Prompt for AI Agents
In protocol/messagesubmission/messagesubmission.go around lines 742–768, change
the state transition flow so you set m.state = StateMessages, collect messages,
perform the send, and only after a successful send set m.state = StateIdle;
specifically, acquire the lock, verify state, set StateMessages, collect
messages, call m.sendFn(reply) while still holding the lock (or otherwise ensure
atomicity of state+send), and if send returns an error do not reset state to
StateIdle (return the error), but on success set StateIdle and then unlock;
ensure you preserve lock/unlock pairing and return the send error unchanged on
failure.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
protocol/messagesubmission/messagesubmission.go (2)

711-718: State transition before send creates desynchronization risk.

Lines 711-715 transition state before calling sendFn at line 718. If the send fails, the local state will be StateMessageIdsBlocking or StateMessageIdsNonBlocking but the peer never received the reply, causing protocol desynchronization.

Apply the same send-before-state pattern used in the Submit methods:

 	}
 
-	if req.IsBlocking {
-		m.state = StateMessageIdsBlocking
-	} else {
-		m.state = StateMessageIdsNonBlocking
-	}
 	m.lock.Unlock()
 
-	return m.sendFn(reply)
+	if err := m.sendFn(reply); err != nil {
+		return err
+	}
+	
+	m.lock.Lock()
+	if req.IsBlocking {
+		m.state = StateMessageIdsBlocking
+	} else {
+		m.state = StateMessageIdsNonBlocking
+	}
+	m.lock.Unlock()
+	return nil

742-768: State transitions before send create desynchronization risk.

Lines 751 and 757 transition state to StateMessages and back to StateIdle before sending the reply at line 767. If the send fails:

  • Local state will be StateIdle
  • Peer never received the requested messages
  • Peer may retry or proceed incorrectly

Apply the same send-before-state pattern:

 	m.lock.Lock()
 	if m.state != StateIdle {
 		m.lock.Unlock()
 		return fmt.Errorf("invalid state for request messages: %v", m.state)
 	}
 	m.state = StateMessages
 
 	// Collect messages while holding lock to prevent data race
 	messages := m.collectMessagesByID(req.MessageIDs)
-
-	// Transition back to idle while still holding lock to avoid races
-	m.state = StateIdle
 	m.lock.Unlock()
 
 	reply := &MsgReplyMessages{
 		MessageBase: protocol.MessageBase{
 			MessageType: MessageTypeReplyMessages,
 		},
 		Messages: messages,
 	}
 
-	return m.sendFn(reply)
+	if err := m.sendFn(reply); err != nil {
+		return err
+	}
+	
+	m.lock.Lock()
+	m.state = StateIdle
+	m.lock.Unlock()
+	return nil

Note: The past review comment about a data race at line 754 is no longer valid, as the lock is properly held during collectMessagesByID.

🧹 Nitpick comments (1)
protocol/messagesubmission/messagesubmission.go (1)

830-863: Consider using bytes.Equal for ID comparisons.

Lines 833 and 856 use string(id) conversions for byte slice comparison. While correct, this creates temporary string allocations.

Consider using bytes.Equal for direct comparison:

+import "bytes"

 func (m *MessageSubmission) lookupMessageSize(id []byte) uint32 {
 	for _, msg := range m.messageQueue {
-		if msg != nil && string(msg.Payload.MessageID) == string(id) {
+		if msg != nil && bytes.Equal(msg.Payload.MessageID, id) {
 			// #nosec G115 -- message body size bounded by protocol limits
 			return uint32(len(msg.Payload.MessageBody))
 		}
 	}
 	return 0
 }

 func (m *MessageSubmission) collectMessagesByID(
 	ids [][]byte,
 ) []pcommon.Message {
 	var messages []pcommon.Message
 	now := time.Now().Unix()
 	for _, id := range ids {
 		for _, msg := range m.messageQueue {
 			if msg == nil {
 				continue
 			}
 			// #nosec G115 -- Unix timestamp will not overflow uint32 until year 2106
 			if uint32(now) > msg.Payload.ExpiresAt {
 				continue
 			}
-			if string(msg.Payload.MessageID) == string(id) {
+			if bytes.Equal(msg.Payload.MessageID, id) {
 				messages = append(messages, *msg)
 				break
 			}
 		}
 	}
 	return messages
 }
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e24b502 and 3879be9.

📒 Files selected for processing (12)
  • protocol/common/authentication.go (1 hunks)
  • protocol/common/authentication_test.go (1 hunks)
  • protocol/common/dmq.go (1 hunks)
  • protocol/localmessagenotification/localmessagenotification.go (1 hunks)
  • protocol/localmessagenotification/messages.go (1 hunks)
  • protocol/localmessagenotification/messages_test.go (1 hunks)
  • protocol/localmessagesubmission/localmessagesubmission.go (1 hunks)
  • protocol/localmessagesubmission/messages.go (1 hunks)
  • protocol/localmessagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messages.go (1 hunks)
  • protocol/messagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messagesubmission.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • protocol/localmessagenotification/messages_test.go
  • protocol/common/dmq.go
  • protocol/localmessagenotification/messages.go
  • protocol/common/authentication_test.go
  • protocol/localmessagesubmission/messages.go
  • protocol/localmessagenotification/localmessagenotification.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-04T15:54:22.683Z
Learnt from: wolf31o2
Repo: blinklabs-io/gouroboros PR: 1093
File: ledger/mary/pparams.go:143-150
Timestamp: 2025-11-04T15:54:22.683Z
Learning: In the blinklabs-io/gouroboros repository, the design goal for CBOR round-trip tests is to achieve byte-identical encoding WITHOUT using stored CBOR (cbor.DecodeStoreCbor). Instead, the approach uses proper field types (pointers for optional fields) and relies on the cbor package's deterministic encoding (SortCoreDeterministic) to ensure reproducible output. The stored CBOR pattern should not be suggested as a solution for round-trip fidelity in this codebase.

Applied to files:

  • protocol/common/authentication.go
  • protocol/localmessagesubmission/messages_test.go
  • protocol/messagesubmission/messages_test.go
  • protocol/messagesubmission/messages.go
🧬 Code graph analysis (4)
protocol/common/authentication.go (3)
cbor/value.go (1)
  • Value (29-33)
protocol/common/dmq.go (2)
  • Message (53-63)
  • OperationalCertificate (40-50)
cbor/encode.go (1)
  • Encode (28-41)
protocol/localmessagesubmission/messages_test.go (3)
protocol/common/dmq.go (6)
  • Message (53-63)
  • MessagePayload (27-37)
  • InvalidReason (89-92)
  • AlreadyReceivedReason (100-100)
  • ExpiredReason (108-108)
  • OtherReason (116-119)
cbor/encode.go (1)
  • Encode (28-41)
protocol/messagesubmission/messages.go (3)
  • NewMsgFromCbor (77-105)
  • MsgDone (72-74)
  • MessageTypeDone (37-37)
protocol/messagesubmission/messages_test.go (2)
protocol/messagesubmission/messages.go (13)
  • MsgInit (41-43)
  • MessageTypeInit (32-32)
  • MsgRequestMessageIds (46-51)
  • MessageTypeRequestMessageIds (33-33)
  • MsgReplyMessageIds (54-57)
  • MessageTypeReplyMessageIds (34-34)
  • MsgRequestMessages (60-63)
  • MessageTypeRequestMessages (35-35)
  • MsgReplyMessages (66-69)
  • MessageTypeReplyMessages (36-36)
  • MsgDone (72-74)
  • MessageTypeDone (37-37)
  • NewMsgFromCbor (77-105)
protocol/localmessagenotification/messages_test.go (1)
  • TestNewMsgFromCborUnknownType (160-164)
protocol/messagesubmission/messagesubmission.go (2)
protocol/state.go (3)
  • NewState (38-43)
  • StateMapEntry (62-67)
  • StateTransition (51-55)
protocol/messagesubmission/messages.go (14)
  • MessageTypeInit (32-32)
  • MessageTypeRequestMessageIds (33-33)
  • MsgRequestMessageIds (46-51)
  • MessageTypeRequestMessages (35-35)
  • MessageTypeDone (37-37)
  • MessageTypeReplyMessageIds (34-34)
  • MessageTypeReplyMessages (36-36)
  • ProtocolName (26-26)
  • NewMsgFromCbor (77-105)
  • MsgInit (41-43)
  • MsgReplyMessageIds (54-57)
  • MsgRequestMessages (60-63)
  • MsgReplyMessages (66-69)
  • MsgDone (72-74)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: cubic · AI code reviewer
🔇 Additional comments (27)
protocol/common/authentication.go (10)

1-30: LGTM!

Imports are appropriate for the authentication functionality. Uses sync and sync/atomic for thread-safety, crypto/ed25519 for signature verification, and golang.org/x/crypto/blake2b for pool ID hashing.


32-55: LGTM!

The struct properly addresses concurrency concerns with sync.RWMutex for map protection. The AllowInsecureKES flag is clearly documented as non-production, and the pluggable kesVerifier callback provides flexibility.


57-82: LGTM!

Thread-safe access to the package-level default KES verifier using atomic.Value with proper type assertion and nil checks.


84-97: LGTM!

Constructor initializes maps, sets secure defaults (AllowInsecureKES defaults to false), and uses the Cardano mainnet standard for slotsPerKesPeriod.


99-129: LGTM!

Proper use of sync.RWMutex with write locks (Lock) for registration/unregistration and read locks (RLock) for queries.


131-194: LGTM!

Comprehensive message verification with proper sequencing. The pool registration check correctly uses RLock for concurrent read access.


196-238: LGTM!

Operational certificate verification correctly validates sizes and uses ed25519.Verify against the CBOR-encoded certificate data.


240-334: LGTM!

KES signature verification now properly:

  1. Wraps payload as bstr .cbor messagePayload per spec (encoding payloadCbor directly)
  2. Uses pluggable verifier when available
  3. Explicitly logs a warning when insecure bypass is enabled
  4. Fails securely when no verifier is available and AllowInsecureKES is false

336-387: LGTM!

Message ID validation, pool ID computation (blake2b-256), and KES period rotation checks are correctly implemented. The rotation check properly uses a write lock since it updates the cache.


389-473: LGTM!

TTL validation correctly:

  1. Defaults to 30 minutes per CIP-0137
  2. Uses uint64 arithmetic to prevent overflow
  3. Clamps to math.MaxUint32 before converting back to uint32
  4. Includes appropriate #nosec annotations for the safe conversions
protocol/localmessagesubmission/localmessagesubmission.go (7)

28-85: LGTM!

State machine and state map are well-defined with appropriate timeouts per CIP-0137:

  • IdleTimeout: 300s for client to submit
  • BusyTimeout: 30s for server reply
  • DoneTimeout: 10s for cleanup

87-151: LGTM!

Clean struct definition and constructor with proper defaults. The constructor correctly applies the default KES verifier when creating a new authenticator internally.


153-218: LGTM!

The state management pattern correctly addresses past review concerns:

  1. Validates message before committing to state change
  2. Re-verifies state after validation in case of concurrent changes
  3. Only transitions to StateBusy after successful Send
  4. Final state check at line 206 catches any concurrent modifications during the send operation

220-279: LGTM!

Both methods correctly transition state to StateIdle only after successful Send, addressing the past review concern about state desynchronization on network failure.


281-316: LGTM!

Correctly verifies state remains StateIdle after sending before transitioning to StateDone, preventing state corruption from concurrent operations.


318-415: LGTM!

Message handlers correctly implement the state machine:

  • handleSubmitMessage validates before state change, transitions to StateBusy, and auto-responds
  • Accept/Reject handlers verify StateBusy and transition to StateIdle
  • handleDone verifies StateIdle and properly stops the protocol

417-425: LGTM!

Force-close method properly sets terminal state and stops the protocol.

protocol/messagesubmission/messages_test.go (2)

1-53: LGTM!

Basic encoding tests properly verify message type accessors and field values.


54-268: LGTM!

Comprehensive test coverage:

  • All message types have encoding tests
  • TestNewMsgFromCborInit properly exercises the CBOR parsing path (addressing past review)
  • TestNewMsgFromCborRoundTrip verifies field preservation and deterministic encoding for MsgRequestMessageIds (lines 189-196)
  • Error handling for unknown message types is tested
protocol/localmessagesubmission/messages_test.go (2)

1-85: LGTM!

Good test coverage for MsgSubmitMessage including type verification, field checks, and CBOR round-trip encoding/decoding.


86-175: LGTM!

Comprehensive testing of all RejectReason variants (InvalidReason, AlreadyReceivedReason, ExpiredReason, OtherReason) with correct type code assertions. The unknown message type error case is properly tested.

protocol/messagesubmission/messages.go (3)

15-38: LGTM!

Clean protocol constants and message type definitions following CIP-0137 specification. Protocol ID 17 is correctly assigned.


40-74: LGTM!

Message structs correctly model the CIP-0137 protocol with appropriate field types. The embedded protocol.MessageBase provides CBOR array serialization.


76-135: LGTM!

Clean factory function with proper error handling for unknown types and decode failures. Type methods correctly return the message type constants.

protocol/messagesubmission/messagesubmission.go (3)

266-579: Submit methods correctly implement send-before-state-transition pattern.

All Submit methods now properly send the message before transitioning state, with appropriate state rechecks to detect concurrent modifications. This addresses the concerns from previous reviews and ensures state consistency even when sends fail.


581-614: GetAvailableMessages correctly removes delivered messages from queue.

The logic properly removes expired messages and delivered messages from the queue while retaining undelivered valid messages. The previous issue regarding message retention has been resolved.


667-675: Verify protocol semantics for oversized AckCount against CIP-0137 specification.

When AckCount exceeds the number of pending message IDs, the code logs a warning but continues processing (leaving pendingMessageIDs unchanged). This differs from other protocol violations in the same function which return errors. Confirm whether CIP-0137 specifies whether oversized AckCount should:

  • Terminate the protocol with an error
  • Be handled gracefully (current behavior)

Add test coverage for oversized AckCount scenarios if validation is confirmed.

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 issues found across 12 files

Prompt for AI agents (all 6 issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="protocol/localmessagesubmission/messages_test.go">

<violation number="1" location="protocol/localmessagesubmission/messages_test.go:81">
P2: Round-trip test only verifies message type, not content. The test encodes a `MsgSubmitMessage` with payload data but after decoding only checks `parsed.Type()`. A proper round-trip test should type-assert the result back to `*MsgSubmitMessage` and verify the fields (e.g., `MessageID`, `MessageBody`, etc.) match the original, similar to how `localtxsubmission/messages_test.go` uses `reflect.DeepEqual`.</violation>
</file>

<file name="protocol/localmessagenotification/messages_test.go">

<violation number="1" location="protocol/localmessagenotification/messages_test.go:51">
P2: Misleading test comment claims to test &#39;encoding&#39; but the test only verifies struct field values. No actual CBOR encoding/decoding is performed. Either update the comment to accurately describe what&#39;s being tested (e.g., &#39;tests MsgReplyMessagesNonBlocking struct creation&#39;) or add actual encoding tests following the pattern in protocol/blockfetch/messages_test.go.</violation>
</file>

<file name="protocol/localmessagesubmission/localmessagesubmission.go">

<violation number="1" location="protocol/localmessagesubmission/localmessagesubmission.go:215">
P1: Race condition: state is set to `StateBusy` after `Send()` returns, but the async response handler (`handleAcceptMessage`/`handleRejectMessage`) expects `StateBusy` and may execute before this line runs. Set state to `StateBusy` before calling `Send()` and revert on failure.</violation>

<violation number="2" location="protocol/localmessagesubmission/localmessagesubmission.go:395">
P2: The `msg` parameter is unused - the rejection reason is silently discarded. Consider logging the reason or providing a callback so clients can learn why their message was rejected.</violation>
</file>

<file name="protocol/messagesubmission/messagesubmission.go">

<violation number="1" location="protocol/messagesubmission/messagesubmission.go:541">
P2: `SubmitDone()` allows sending Done from `StateMessageIdsNonBlocking`, but the state machine does not define this transition for `protocolStateMessageIdsNonBlk`. Only `protocolStateMessageIdsBlock` and `protocolStateIdle` have valid transitions for `MessageTypeDone`. This mismatch could cause protocol framework errors.</violation>
</file>

<file name="protocol/localmessagenotification/localmessagenotification.go">

<violation number="1" location="protocol/localmessagenotification/localmessagenotification.go:586">
P2: Missing state validation in `handleClientDone`. Per the state machine definition, `ClientDone` should only be valid from `StateIdle`, but this handler doesn&#39;t verify the current state before transitioning to `StateDone`.</violation>
</file>

Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR

data, err := cbor.Encode(msg)
assert.NoError(t, err)

parsed, err := NewMsgFromCbor(uint(MessageTypeSubmitMessage), data)
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Round-trip test only verifies message type, not content. The test encodes a MsgSubmitMessage with payload data but after decoding only checks parsed.Type(). A proper round-trip test should type-assert the result back to *MsgSubmitMessage and verify the fields (e.g., MessageID, MessageBody, etc.) match the original, similar to how localtxsubmission/messages_test.go uses reflect.DeepEqual.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At protocol/localmessagesubmission/messages_test.go, line 81:

<comment>Round-trip test only verifies message type, not content. The test encodes a `MsgSubmitMessage` with payload data but after decoding only checks `parsed.Type()`. A proper round-trip test should type-assert the result back to `*MsgSubmitMessage` and verify the fields (e.g., `MessageID`, `MessageBody`, etc.) match the original, similar to how `localtxsubmission/messages_test.go` uses `reflect.DeepEqual`.</comment>

<file context>
@@ -0,0 +1,175 @@
+	data, err := cbor.Encode(msg)
+	assert.NoError(t, err)
+
+	parsed, err := NewMsgFromCbor(uint(MessageTypeSubmitMessage), data)
+	assert.NoError(t, err)
+	assert.Equal(t, uint8(MessageTypeSubmitMessage), parsed.Type())
</file context>
Fix with Cubic

assert.Equal(t, true, msg.IsBlocking)
}

// TestMsgReplyMessagesNonBlocking tests MsgReplyMessagesNonBlocking encoding
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Misleading test comment claims to test 'encoding' but the test only verifies struct field values. No actual CBOR encoding/decoding is performed. Either update the comment to accurately describe what's being tested (e.g., 'tests MsgReplyMessagesNonBlocking struct creation') or add actual encoding tests following the pattern in protocol/blockfetch/messages_test.go.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At protocol/localmessagenotification/messages_test.go, line 51:

<comment>Misleading test comment claims to test &#39;encoding&#39; but the test only verifies struct field values. No actual CBOR encoding/decoding is performed. Either update the comment to accurately describe what&#39;s being tested (e.g., &#39;tests MsgReplyMessagesNonBlocking struct creation&#39;) or add actual encoding tests following the pattern in protocol/blockfetch/messages_test.go.</comment>

<file context>
@@ -0,0 +1,164 @@
+	assert.Equal(t, true, msg.IsBlocking)
+}
+
+// TestMsgReplyMessagesNonBlocking tests MsgReplyMessagesNonBlocking encoding
+func TestMsgReplyMessagesNonBlocking(t *testing.T) {
+	messages := []pcommon.Message{
</file context>

✅ Addressed in b3a9518

}

func (l *LocalMessageSubmission) handleRejectMessage(
msg *MsgRejectMessage,
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: The msg parameter is unused - the rejection reason is silently discarded. Consider logging the reason or providing a callback so clients can learn why their message was rejected.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At protocol/localmessagesubmission/localmessagesubmission.go, line 395:

<comment>The `msg` parameter is unused - the rejection reason is silently discarded. Consider logging the reason or providing a callback so clients can learn why their message was rejected.</comment>

<file context>
@@ -0,0 +1,425 @@
+}
+
+func (l *LocalMessageSubmission) handleRejectMessage(
+	msg *MsgRejectMessage,
+) error {
+	l.lock.Lock()
</file context>
Fix with Cubic

l.state,
)
}
l.state = StateBusy
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Race condition: state is set to StateBusy after Send() returns, but the async response handler (handleAcceptMessage/handleRejectMessage) expects StateBusy and may execute before this line runs. Set state to StateBusy before calling Send() and revert on failure.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At protocol/localmessagesubmission/localmessagesubmission.go, line 215:

<comment>Race condition: state is set to `StateBusy` after `Send()` returns, but the async response handler (`handleAcceptMessage`/`handleRejectMessage`) expects `StateBusy` and may execute before this line runs. Set state to `StateBusy` before calling `Send()` and revert on failure.</comment>

<file context>
@@ -0,0 +1,425 @@
+			l.state,
+		)
+	}
+	l.state = StateBusy
+	l.lock.Unlock()
+	return nil
</file context>
Fix with Cubic

func (m *MessageSubmission) SubmitDone() error {
m.lock.Lock()
if m.state != StateMessageIdsBlocking &&
m.state != StateMessageIdsNonBlocking &&
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: SubmitDone() allows sending Done from StateMessageIdsNonBlocking, but the state machine does not define this transition for protocolStateMessageIdsNonBlk. Only protocolStateMessageIdsBlock and protocolStateIdle have valid transitions for MessageTypeDone. This mismatch could cause protocol framework errors.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At protocol/messagesubmission/messagesubmission.go, line 541:

<comment>`SubmitDone()` allows sending Done from `StateMessageIdsNonBlocking`, but the state machine does not define this transition for `protocolStateMessageIdsNonBlk`. Only `protocolStateMessageIdsBlock` and `protocolStateIdle` have valid transitions for `MessageTypeDone`. This mismatch could cause protocol framework errors.</comment>

<file context>
@@ -0,0 +1,874 @@
+func (m *MessageSubmission) SubmitDone() error {
+	m.lock.Lock()
+	if m.state != StateMessageIdsBlocking &amp;&amp;
+		m.state != StateMessageIdsNonBlocking &amp;&amp;
+		m.state != StateIdle {
+		m.lock.Unlock()
</file context>
Fix with Cubic

return nil
}

func (l *LocalMessageNotification) handleClientDone() error {
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Missing state validation in handleClientDone. Per the state machine definition, ClientDone should only be valid from StateIdle, but this handler doesn't verify the current state before transitioning to StateDone.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At protocol/localmessagenotification/localmessagenotification.go, line 586:

<comment>Missing state validation in `handleClientDone`. Per the state machine definition, `ClientDone` should only be valid from `StateIdle`, but this handler doesn&#39;t verify the current state before transitioning to `StateDone`.</comment>

<file context>
@@ -0,0 +1,703 @@
+	return nil
+}
+
+func (l *LocalMessageNotification) handleClientDone() error {
+	l.lock.Lock()
+	defer l.lock.Unlock()
</file context>
Fix with Cubic

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (3)
protocol/messagesubmission/messagesubmission.go (3)

775-795: Partial message enqueueing on queue-full error remains unfixed.

If the queue becomes full mid-iteration (line 787-788), previously added messages remain in the queue while the function returns an error. This creates an inconsistent state where only a subset of the reply is processed.

The previous review suggested pre-checking capacity. Consider implementing:

 	// Validate and enqueue messages
+	if len(m.messageQueue)+len(msg.Messages) > m.maxQueueSize {
+		return errors.New("insufficient queue capacity for all messages")
+	}
 	for _, message := range msg.Messages {

707-715: State transition before send in handleRequestMessageIds.

The state transitions to StateMessageIdsBlocking or StateMessageIdsNonBlocking (lines 708-711) before sendFn(reply) at line 714. If the send fails, the local state will be inconsistent with the peer's state.

Apply the same send-before-transition pattern used in the Submit* methods.


738-764: State transition to Idle occurs before send, risking desynchronization.

The fix for the collectMessagesByID data race is in place (called while holding lock). However, the state transitions to StateIdle at line 753 before sendFn(reply) at line 763. If sendFn fails, the local state will be Idle but the peer never received the response.

Apply the send-before-transition pattern:

 	m.state = StateMessages
 
 	// Collect messages while holding lock to prevent data race
 	messages := m.collectMessagesByID(req.MessageIDs)
-
-	// Transition back to idle while still holding lock to avoid races
-	m.state = StateIdle
 	m.lock.Unlock()
 
 	reply := &MsgReplyMessages{
 		MessageBase: protocol.MessageBase{
 			MessageType: MessageTypeReplyMessages,
 		},
 		Messages: messages,
 	}
 
-	return m.sendFn(reply)
+	if err := m.sendFn(reply); err != nil {
+		return err
+	}
+	
+	m.lock.Lock()
+	m.state = StateIdle
+	m.lock.Unlock()
+	return nil
 }
🧹 Nitpick comments (3)
protocol/messagesubmission/messages_test.go (1)

201-215: Consider adding field verification for other round-trip tests.

The MsgRequestMessageIds round-trip test (lines 189-199) properly verifies field preservation and byte-identical re-encoding, which aligns with the codebase's CBOR testing goals. For consistency, consider extending similar field-level assertions to MsgReplyMessageIds, MsgRequestMessages, and MsgReplyMessages to ensure their slice fields are correctly preserved after decoding.

Example for MsgReplyMessageIds:

 parsed, err = NewMsgFromCbor(uint(MessageTypeReplyMessageIds), data)
 assert.NoError(t, err)
 assert.Equal(t, uint8(MessageTypeReplyMessageIds), parsed.Type())
+if decoded, ok := parsed.(*MsgReplyMessageIds); ok {
+	assert.Equal(t, len(msgs), len(decoded.Messages))
+	assert.Equal(t, msgs[0].MessageID, decoded.Messages[0].MessageID)
+}
protocol/common/authentication_test.go (2)

29-35: Test accesses internal field kesOpCertCache.

The test asserts on auth.kesOpCertCache which appears to be an unexported field. This creates fragile tests that depend on internal implementation details. Consider either:

  1. Making this an exported field if it's part of the public API
  2. Removing this assertion if it's purely internal
  3. Adding a public method to check initialization state
 func TestMessageAuthenticatorCreation(t *testing.T) {
 	auth := NewMessageAuthenticator(nil)
 	assert.NotNil(t, auth)
-	// Public API checks
-	assert.NotNil(t, auth.kesOpCertCache)
 }

85-101: Test uses unexported method computePoolID.

Similar to the earlier observation, auth.computePoolID is called directly in tests. If this is intentional for testing internal logic, consider adding a comment explaining why, or expose it via a test-only mechanism.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3879be9 and 9e2643b.

📒 Files selected for processing (12)
  • protocol/common/authentication.go (1 hunks)
  • protocol/common/authentication_test.go (1 hunks)
  • protocol/common/dmq.go (1 hunks)
  • protocol/localmessagenotification/localmessagenotification.go (1 hunks)
  • protocol/localmessagenotification/messages.go (1 hunks)
  • protocol/localmessagenotification/messages_test.go (1 hunks)
  • protocol/localmessagesubmission/localmessagesubmission.go (1 hunks)
  • protocol/localmessagesubmission/messages.go (1 hunks)
  • protocol/localmessagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messages.go (1 hunks)
  • protocol/messagesubmission/messages_test.go (1 hunks)
  • protocol/messagesubmission/messagesubmission.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • protocol/localmessagenotification/messages_test.go
  • protocol/localmessagesubmission/messages_test.go
  • protocol/localmessagesubmission/messages.go
  • protocol/localmessagesubmission/localmessagesubmission.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-04T15:54:22.683Z
Learnt from: wolf31o2
Repo: blinklabs-io/gouroboros PR: 1093
File: ledger/mary/pparams.go:143-150
Timestamp: 2025-11-04T15:54:22.683Z
Learning: In the blinklabs-io/gouroboros repository, the design goal for CBOR round-trip tests is to achieve byte-identical encoding WITHOUT using stored CBOR (cbor.DecodeStoreCbor). Instead, the approach uses proper field types (pointers for optional fields) and relies on the cbor package's deterministic encoding (SortCoreDeterministic) to ensure reproducible output. The stored CBOR pattern should not be suggested as a solution for round-trip fidelity in this codebase.

Applied to files:

  • protocol/messagesubmission/messages.go
  • protocol/messagesubmission/messages_test.go
  • protocol/common/authentication.go
  • protocol/common/dmq.go
🧬 Code graph analysis (2)
protocol/messagesubmission/messagesubmission.go (3)
protocol/state.go (6)
  • NewState (38-43)
  • StateMapEntry (62-67)
  • AgencyClient (27-27)
  • StateTransition (51-55)
  • AgencyServer (28-28)
  • AgencyNone (26-26)
protocol/messagesubmission/messages.go (12)
  • MessageTypeInit (32-32)
  • MessageTypeRequestMessageIds (33-33)
  • MsgRequestMessageIds (46-51)
  • MessageTypeRequestMessages (35-35)
  • MessageTypeDone (37-37)
  • MessageTypeReplyMessageIds (34-34)
  • MessageTypeReplyMessages (36-36)
  • MsgInit (41-43)
  • MsgReplyMessageIds (54-57)
  • MsgRequestMessages (60-63)
  • MsgReplyMessages (66-69)
  • MsgDone (72-74)
protocol/common/dmq.go (2)
  • Message (53-63)
  • MessageIDAndSize (66-72)
protocol/common/dmq.go (1)
cbor/cbor.go (1)
  • StructAsArray (45-48)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (32)
protocol/messagesubmission/messages_test.go (1)

1-24: LGTM! Well-structured test file with comprehensive coverage.

The test file properly covers encoding, decoding, and CBOR round-trips for all message types. The imports are appropriate and the structure follows Go testing conventions.

protocol/common/dmq.go (3)

27-72: LGTM! Well-designed DMQ structures with proper CBOR array encoding.

The struct definitions correctly embed cbor.StructAsArray for CIP-0137 wire format compliance. Field sizes match cryptographic expectations (32-byte Ed25519 keys, 64-byte signatures, 448-byte KES signatures).


74-79: IsValid logic is correct.

The <= comparison correctly treats the expiration timestamp as inclusive (message is valid up to and including the expiration moment). The #nosec G115 annotation appropriately documents the uint32 overflow concern.


81-124: Clean rejection reason framework.

The RejectReason interface with concrete implementations follows idiomatic Go patterns. Type codes 0-3 align with the specification, and the use of empty structs for AlreadyReceivedReason and ExpiredReason is appropriate.

protocol/common/authentication_test.go (3)

136-262: Comprehensive TTL validation tests.

Good coverage of edge cases including default values, custom TTL, expired messages, valid messages, too-far-future expiration, and nil handling. The time range assertions (lines 234-235) appropriately account for test execution time variance.


399-439: Well-designed test helper for KES verification scenarios.

The buildTestMessage helper properly generates ed25519 keypairs and creates valid operational certificates with correct CBOR-encoded signatures. This enables thorough testing of KES verification paths.


441-534: Thorough KES verifier injection tests.

The tests comprehensively cover:

  • Successful verification with injected verifier
  • Verifier rejection (returns false, nil)
  • Verifier error (returns false, error)
  • Default verifier application via package-level setter

The closure pattern for capturing call parameters is clean and effective.

protocol/messagesubmission/messages.go (4)

25-38: Protocol constants are well-defined.

The protocol ID (17) and message type codes (0-5) align with CIP-0137 specification. Using named constants improves readability and maintainability.


40-74: Message structs are well-designed.

All message types correctly embed protocol.MessageBase for consistent CBOR array encoding. Field types are appropriate: uint16 for counts, []byte for message IDs, and references to pcommon types for complex structures maintain consistency with the DMQ type definitions.


76-105: Clean factory function with proper error handling.

The NewMsgFromCbor factory correctly handles all message types, provides descriptive error messages with protocol context, and stores the raw CBOR for potential re-serialization. Error wrapping with %w preserves the underlying decode error.


107-135: Type() methods correctly implemented.

All six Type() methods return the appropriate message type constant. The implementations satisfy the protocol.Message interface.

protocol/localmessagenotification/messages.go (3)

25-36: Protocol constants properly defined for Local Message Notification.

Protocol ID 19 and message type codes (0-3) align with CIP-0137 specification for the local message notification protocol.


38-60: Message structs reflect blocking/non-blocking semantics correctly.

The design separates blocking and non-blocking reply types explicitly:

  • MsgReplyMessagesNonBlocking includes HasMore to indicate additional messages may be available
  • MsgReplyMessagesBlocking omits HasMore since blocking calls wait for messages

This type-safe approach prevents misuse of the HasMore field in blocking contexts.


62-107: Factory function and Type() methods are consistent with codebase patterns.

The NewMsgFromCbor factory and Type() methods follow the same well-structured pattern as the Message Submission protocol, maintaining codebase consistency.

protocol/localmessagenotification/localmessagenotification.go (7)

1-113: State machine and timeout definitions look correct.

The protocol states, timeouts, and state map transitions align with CIP-0137 specifications. The MatchFunc usage for differentiating blocking vs non-blocking requests is appropriate.


162-171: Constructor properly initializes all fields with sensible defaults.

The done channel and newMessageSignalClosed flag address the previously raised concurrency concerns. The authenticator and TTL validator defaults are applied correctly.


194-231: AddMessage correctly guards against concurrent Close().

The fix addresses the previous deadlock issue by reading newMessageSignalClosed directly while holding the lock (as noted in the comment on lines 222-223).


234-271: State transitions after send correctly implemented.

The pattern of unlocking before send, then re-acquiring the lock and verifying state hasn't changed before transitioning addresses the previously raised state desynchronization concerns.


432-458: Message queue consumption logic correctly implemented.

Messages returned in available are no longer added to filtered (lines 443-448), fixing the duplicate delivery bug. The acknowledgedIDs map is cleared after use (line 453), preventing unbounded memory growth.


586-595: State validation added to handleClientDone.

The handler now validates that the state is StateIdle before transitioning to StateDone, addressing the previously raised concern about missing state validation.


674-706: Close() properly guards against concurrent channel operations.

The implementation uses a done channel with select to prevent double-close panics (lines 680-685), and sets newMessageSignalClosed before closing the channel to prevent sends after close.

protocol/messagesubmission/messagesubmission.go (4)

50-152: State machine correctly defines protocol states and transitions.

The state map properly configures agencies, timeouts, and transitions per CIP-0137. The MatchFunc differentiates blocking vs non-blocking message ID requests correctly.


538-575: SubmitDone state check aligns with state machine.

The state check now correctly allows StateMessageIdsBlocking and StateIdle (lines 540-541), matching the state map transitions defined at lines 103-106 and 118-121. The previous concern about StateMessageIdsNonBlocking mismatch is addressed.


577-610: GetAvailableMessages correctly removes returned messages from queue.

The logic at lines 598-604 ensures delivered messages are not appended to filtered, fixing the previous duplicate delivery bug.


862-870: Close implementation is straightforward and correct.

The method properly transitions to StateDone, stops the protocol, and clears the queue under lock protection.

protocol/common/authentication.go (7)

35-55: MessageAuthenticator properly protects maps with RWMutex.

The mu sync.RWMutex field (line 38) addresses the previous data race concern. The AllowInsecureKES flag and kesVerifier callback provide the flexibility needed for both production and test environments.


57-82: Thread-safe default KES verifier management with atomic.Value.

The defaultKESVerifier uses atomic.Value for race-free concurrent access, addressing the previous synchronization concern.


108-129: Pool registration methods use proper read/write locking.

RegisterSPOPool and UnregisterSPOPool use exclusive locks, while IsSPOPoolRegistered correctly uses a read lock for concurrent read access.


267-334: KES verification properly handles production vs test scenarios.

The implementation correctly:

  1. Wraps the payload as bstr .cbor messagePayload (lines 267-273) without array nesting
  2. Uses injected verifier when available (lines 283-316)
  3. Provides explicit insecure bypass with warning log for tests (lines 320-327)
  4. Fails hard in production when no verifier is available (lines 329-333)

364-387: KES period rotation check prevents replay attacks with stale credentials.

The method properly uses mutex locking for thread-safe cache access and rejects operational certificates with issue numbers that go backwards.


437-454: TTL validation correctly prevents integer overflow.

The computation uses uint64 (lines 439-440) and clamps to math.MaxUint32 (lines 441-444), addressing the previous overflow concern when maxAllowedTTL is large.


459-473: GetTimeUntilExpiration provides a clean utility for TTL checking.

Simple and correct implementation for computing remaining message lifetime.

Signed-off-by: Chris Gianelloni <wolf31o2@blinklabs.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant