-
Notifications
You must be signed in to change notification settings - Fork 6
feat: add method to parse image from fsuri, like btcfs, ipfs and ckbfs #35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @ashuralyk, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the dob-decoder-server by introducing the capability to parse and extract images from decentralized file system URIs like BTCFS and IPFS. It also expands the server's accessibility by implementing a new RESTful API alongside the existing JSON-RPC interface. Furthermore, it includes crucial dependency updates, improves caching mechanisms for decoders, and refines DNA content parsing for greater robustness.
Highlights
- New Feature: Image Parsing from FSURI: Introduced a new method to parse and extract images from decentralized file system URIs, specifically
btcfs://andipfs://. This involves fetching data from external services like Mempool.space for BTCFS and IPFS gateways for IPFS content, and then extracting the image data. - RESTful API Support: A new RESTful API server has been implemented using the
axumframework, running concurrently with the existing JSON-RPC server. This provides alternative endpoints for decoding operations and the newly added image extraction functionality. - Extensive Dependency Updates: Numerous Rust crate dependencies have been updated across
Cargo.lockandCargo.toml. Notable updates includejsonrpsee(0.22.3 to 0.26.0),hyper(1.3.1 to 1.6.0),tokio(1.37.0 to 1.47.1), andsyn(2.0.57 to 2.0.106). Many new crates were also added to support the new features and updated dependencies. - Improved Decoder Caching: The decoder caching mechanism now includes an expiration policy for
type_idandtype_scriptdecoders. This allows the server to periodically refresh cached decoders, ensuring they remain up-to-date rather than being cached indefinitely after their initial fetch. - Robust DNA Content Parsing: The
decode_spore_contentfunction has been enhanced to gracefully handle content that is not valid JSON. If the content cannot be parsed as JSON, it is now treated as a direct DNA string, improving the robustness of content interpretation.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a significant new feature to parse images from filesystem URIs like btcfs and ipfs, and also refactors the server to support both JSON-RPC and a new RESTful API using Axum. The changes are extensive, including new dependencies, configuration options, and an image fetching client. While the implementation is mostly solid, I've identified a few areas for improvement regarding error handling, feature completeness, code clarity, and consistency.
| .get("btcfs") | ||
| .ok_or(Error::FsuriNotFoundInConfig)? | ||
| .join(&tx_hash) | ||
| .expect("image url"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using expect() can lead to a panic if the URL join operation fails, for instance, if tx_hash contains characters that are not allowed in a URL path segment. It's safer to handle the Result and return a proper error to make the client more robust.
| .expect("image url"); | |
| .join(&tx_hash) | |
| .map_err(|e| Error::InvalidOnchainFsuriFormat)?; |
| impl TryFrom<&String> for URI { | ||
| type Error = Error; | ||
|
|
||
| fn try_from(uri: &String) -> Result<Self, Error> { | ||
| if let Some(body) = uri.strip_prefix("btcfs://") { | ||
| let parts: Vec<&str> = body.split('i').collect::<Vec<_>>(); | ||
| if parts.len() != 2 { | ||
| return Err(Error::InvalidOnchainFsuriFormat); | ||
| } | ||
| let tx_hash = parts[0].to_string(); | ||
| let index = parts[1] | ||
| .parse() | ||
| .map_err(|_| Error::InvalidOnchainFsuriFormat)?; | ||
| Ok(URI::BTCFS(tx_hash, index)) | ||
| } else if let Some(body) = uri.strip_prefix("ipfs://") { | ||
| let hash = body.to_string(); | ||
| Ok(URI::IPFS(hash)) | ||
| } else { | ||
| Err(Error::InvalidOnchainFsuriFormat) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| #[method(name = "dob_extract_image")] | ||
| async fn extract_image( | ||
| &self, | ||
| fsuri: String, | ||
| encode_type: Option<String>, | ||
| ) -> Result<String, ErrorObjectOwned>; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RPC method is named dob_extract_image here, but the README.md example and other parts of the code refer to it as dob_extract_image_from_fsuri. This inconsistency can be confusing for API users. It's best to use a consistent and descriptive name across the board.
| #[method(name = "dob_extract_image")] | |
| async fn extract_image( | |
| &self, | |
| fsuri: String, | |
| encode_type: Option<String>, | |
| ) -> Result<String, ErrorObjectOwned>; | |
| } | |
| #[method(name = "dob_extract_image_from_fsuri")] | |
| async fn extract_image_from_fsuri( | |
| &self, | |
| fsuri: String, | |
| encode_type: Option<String>, | |
| ) -> Result<String, ErrorObjectOwned>; |
| if let Some(captures) = header_pattern.captures(inscription) { | ||
| let matched_header = &captures[0]; | ||
| let base_removed = inscription.replace(matched_header, ""); | ||
| let hexed = regex_replace_all!(r#"\s?OP\_\w+\s?"#, &base_removed, ""); | ||
| let image = hex::decode(hexed.as_bytes()) | ||
| .map_err(|_| Error::InvalidInscriptionContentHexFormat)?; | ||
| images.push(image); | ||
| } else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex for the inscription header captures the content type of the inscription, but this information is not used. The captures are only used to get the full matched header string for removal. If the content type is important for any downstream processing or validation, it should be extracted and potentially returned from this function. If it's not needed, consider simplifying the regex to not capture these groups for a minor performance improvement and clearer intent.
src/decoder/helpers.rs
Outdated
| fn file_older_than_hours(file_path: &PathBuf, hours: u64) -> bool { | ||
| match std::fs::metadata(file_path) { | ||
| Ok(metadata) => { | ||
| let Ok(mut duration) = SystemTime::now().duration_since(SystemTime::UNIX_EPOCH) else { | ||
| return true; | ||
| }; | ||
| if let Ok(Ok(checkpoint)) = metadata | ||
| .modified() | ||
| .map(|time| time.duration_since(SystemTime::UNIX_EPOCH)) | ||
| { | ||
| duration = duration.saturating_sub(checkpoint); | ||
| } else if let Ok(Ok(checkpoint)) = metadata | ||
| .created() | ||
| .map(|time| time.duration_since(SystemTime::UNIX_EPOCH)) | ||
| { | ||
| duration = duration.saturating_sub(checkpoint); | ||
| } | ||
| duration.as_secs() / 3600 >= hours | ||
| } | ||
| Err(_) => true, | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic in file_older_than_hours is a bit complex and hard to follow. It can be simplified for better readability and maintainability by using Duration::elapsed().
fn file_older_than_hours(file_path: &PathBuf, hours: u64) -> bool {
if hours == 0 {
return true; // As per config comments, 0 means always update.
}
let metadata = match std::fs::metadata(file_path) {
Ok(m) => m,
Err(_) => return true, // File doesn't exist, so it's "old".
};
if let Ok(modified_time) = metadata.modified().or_else(|_| metadata.created()) {
if let Ok(elapsed) = modified_time.elapsed() {
return elapsed.as_secs() >= hours * 3600;
}
}
true // If we can't determine file age, assume it's old.
}| match encode_type.as_deref() { | ||
| Some("hex") | None => Ok(hex::encode(image)), | ||
| Some("base64") => Ok(STANDARD.encode(image)), | ||
| unknown => Err(ErrorObjectOwned::owned::<serde_json::Value>( | ||
| -1, | ||
| format!( | ||
| "Unknown encode type: {}. Supported types: 'base64', 'hex' (default)", | ||
| unknown.unwrap_or("unknown") | ||
| ), | ||
| None, | ||
| )), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pattern matching for encode_type can be made clearer. The unknown variable is of type Option<&str>, which is a bit confusing. Matching on Some(unknown_str) would make the intent clearer and improve readability.
| match encode_type.as_deref() { | |
| Some("hex") | None => Ok(hex::encode(image)), | |
| Some("base64") => Ok(STANDARD.encode(image)), | |
| unknown => Err(ErrorObjectOwned::owned::<serde_json::Value>( | |
| -1, | |
| format!( | |
| "Unknown encode type: {}. Supported types: 'base64', 'hex' (default)", | |
| unknown.unwrap_or("unknown") | |
| ), | |
| None, | |
| )), | |
| } | |
| match encode_type.as_deref() { | |
| Some("hex") | None => Ok(hex::encode(image)), | |
| Some("base64") => Ok(STANDARD.encode(image)), | |
| Some(unknown_type) => Err(ErrorObjectOwned::owned::<serde_json::Value>( | |
| -1, | |
| format!( | |
| "Unknown encode type: {}. Supported types: 'base64', 'hex' (default)", | |
| unknown_type | |
| ), | |
| None, | |
| )), | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a significant new feature for parsing images from btcfs and ipfs URIs, and exposes this functionality through both a new JSON-RPC method and a new RESTful API server. The changes are extensive, including new dependencies, a new ImageFetchClient, and a major refactoring of the server module to support both RPC and REST interfaces. While the new functionality is a great addition, I've found a few issues that should be addressed. These include a critical structural problem with duplicated server code, a potential parsing bug in the Bitcoin script handling, opportunities for improved error handling and code simplification, and some inconsistencies in the documentation. Addressing these points will improve the robustness, maintainability, and clarity of the new features.
| mod core; | ||
| mod jsonrpc; | ||
| mod restful; | ||
| mod utils; | ||
|
|
||
| pub use core::DecoderStandaloneServer; | ||
| pub use jsonrpc::DecoderRpcServer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a significant structural issue with the new server module. Both src/server/core.rs and src/server/jsonrpc.rs define a DecoderRpc trait and an impl DecoderRpcServer block. This code duplication is confusing and will likely lead to compilation errors due to multiple conflicting implementations.
Additionally, the RPC method for image extraction is named dob_extract_image_from_fsuri in core.rs but dob_extract_image in jsonrpc.rs, which is inconsistent.
It's critical to refactor this to have a single source of truth. I recommend consolidating all the server logic and RPC definitions into src/server/core.rs and removing the duplicated code in src/server/jsonrpc.rs to resolve this ambiguity.
| while let (Some(start), Some(end)) = (witness_view.find("OP_IF"), witness_view.find("OP_ENDIF")) | ||
| { | ||
| if start >= end { | ||
| return Err(Error::InvalidInscriptionFormat( | ||
| "bad start and end position".to_string(), | ||
| )); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current logic for finding inscription envelopes uses witness_view.find("OP_IF") and witness_view.find("OP_ENDIF"). This approach will not correctly handle nested OP_IF/OP_ENDIF blocks, as it will match the first OP_IF with the first OP_ENDIF it finds, which may not be the correct corresponding one. This could lead to incorrect parsing of valid Bitcoin scripts.
To fix this, consider implementing a more robust parsing mechanism, such as a stack-based approach, to correctly identify matching OP_IF and OP_ENDIF pairs.
| ```bash | ||
| $ echo '{ | ||
| "id": 3, | ||
| "jsonrpc": "2.0", | ||
| "method": "dob_decode_svg", | ||
| "params": [ | ||
| "0x577bf0de0dcffe2811fa827480a700bc800c8e1e9606615b1484baeea2cba830" | ||
| ] | ||
| }' \ | ||
| | curl -H 'content-type: application/json' -d @- \ | ||
| http://localhost:8090 | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation has been updated to include an example for a dob_decode_svg method. However, the pull request description states that this method was purged, and I can't find its implementation in the new code. This is confusing for users. Please remove this example from the README to align the documentation with the actual implementation.
| .join(&tx_hash) | ||
| .expect("image url"); | ||
| requests.push(parse_image_from_btcfs(url, index).boxed()); | ||
| } | ||
| URI::IPFS(cid) => { | ||
| let url = self | ||
| .base_url | ||
| .get("ipfs") | ||
| .ok_or(Error::FsuriNotFoundInConfig)? | ||
| .join(&cid) | ||
| .expect("image url"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using .expect() will cause the program to panic if joining the URL fails. While it's unlikely with the current URIs, it's better to handle this potential failure gracefully by returning a Result. This makes the client more robust.
| .join(&tx_hash) | |
| .expect("image url"); | |
| requests.push(parse_image_from_btcfs(url, index).boxed()); | |
| } | |
| URI::IPFS(cid) => { | |
| let url = self | |
| .base_url | |
| .get("ipfs") | |
| .ok_or(Error::FsuriNotFoundInConfig)? | |
| .join(&cid) | |
| .expect("image url"); | |
| .join(&tx_hash) | |
| .map_err(|_| Error::InvalidOnchainFsuriFormat)?; | |
| requests.push(parse_image_from_btcfs(url, index).boxed()); | |
| } | |
| URI::IPFS(cid) => { | |
| let url = self | |
| .base_url | |
| .get("ipfs") | |
| .ok_or(Error::FsuriNotFoundInConfig)? | |
| .join(&cid) | |
| .map_err(|_| Error::InvalidOnchainFsuriFormat)?; |
| let header_pattern = regex!( | ||
| r#"OP_IF\s+OP_PUSHBYTES_3\s+444f42\s+OP_PUSHBYTES_1\s+01\s+OP_PUSHBYTES_(\d+)\s+([0-9a-fA-F]+)\s+OP_0\s+OP_PUSHDATA2\s+"# | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex header_pattern includes capture groups for the length of the content type and the content type itself. However, these captures are not being used in the parsing logic. This is a missed opportunity to perform validation on the inscription header, for example, by checking the content type. Using these captures would make the parsing more robust.
src/decoder/helpers.rs
Outdated
| fn file_older_than_hours(file_path: &PathBuf, hours: u64) -> bool { | ||
| match std::fs::metadata(file_path) { | ||
| Ok(metadata) => { | ||
| let Ok(mut duration) = SystemTime::now().duration_since(SystemTime::UNIX_EPOCH) else { | ||
| return true; | ||
| }; | ||
| if let Ok(Ok(checkpoint)) = metadata | ||
| .modified() | ||
| .map(|time| time.duration_since(SystemTime::UNIX_EPOCH)) | ||
| { | ||
| duration = duration.saturating_sub(checkpoint); | ||
| } else if let Ok(Ok(checkpoint)) = metadata | ||
| .created() | ||
| .map(|time| time.duration_since(SystemTime::UNIX_EPOCH)) | ||
| { | ||
| duration = duration.saturating_sub(checkpoint); | ||
| } | ||
| duration.as_secs() / 3600 >= hours | ||
| } | ||
| Err(_) => true, | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of file_older_than_hours can be simplified and made more readable by using SystemTime::elapsed(). This avoids manual duration calculations and makes the intent clearer. Also, the case where hours is 0 should be handled explicitly to match the documented behavior ("zero means always update").
fn file_older_than_hours(file_path: &PathBuf, hours: u64) -> bool {
if hours == 0 {
return true; // Always update if expiration is 0, as per documentation.
}
if let Ok(metadata) = std::fs::metadata(file_path) {
if let Ok(modified) = metadata.modified() {
if let Ok(elapsed) = modified.elapsed() {
return elapsed.as_secs() >= hours * 3600;
}
}
}
// Default to re-fetching if metadata or time fails, or if file doesn't exist.
true
}|
This PR runs on Testnet and Mainnet for a while, it seems ok to merge |
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a significant new feature for parsing images from btcfs and ipfs URIs, along with a new RESTful API server. The server code has been refactored into a more modular structure, which is a great improvement.
My review focuses on a few areas:
- Code Duplication: I've identified some duplicated code in the server implementation that should be removed.
- Error Handling: There are a couple of places where using
.expect()could lead to a panic; I've suggested using proper error handling instead. - Documentation: The
README.mdcontains outdated and incorrect examples for the new RPC methods. I've pointed these out so they can be corrected.
Overall, this is a solid contribution with a lot of new functionality. Addressing these points will improve the robustness and maintainability of the code.
| **Decode and extract SVG from another example spore ID on mainnet:** | ||
|
|
||
| ```bash | ||
| $ echo '{ | ||
| "id": 3, | ||
| "jsonrpc": "2.0", | ||
| "method": "dob_decode_svg", | ||
| "params": [ | ||
| "0x577bf0de0dcffe2811fa827480a700bc800c8e1e9606615b1484baeea2cba830" | ||
| ] | ||
| }' \ | ||
| | curl -H 'content-type: application/json' -d @- \ | ||
| http://localhost:8090 | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| $ echo '{ | ||
| "id": 4, | ||
| "jsonrpc": "2.0", | ||
| "method": "dob_extract_image_from_fsuri", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RPC method name in this example is incorrect. The implementation in src/server/jsonrpc.rs uses dob_extract_image, but the example shows dob_extract_image_from_fsuri. Please update the example to use the correct method name.
| "method": "dob_extract_image_from_fsuri", | |
| "method": "dob_extract_image", |
| .join(&tx_hash) | ||
| .expect("image url"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using .expect() can cause the server to panic if joining the URL fails, for example if the transaction hash is not a valid URL path segment. It's better to handle this potential error gracefully by converting it into a Result.
| .join(&tx_hash) | |
| .expect("image url"); | |
| .join(&tx_hash) | |
| .map_err(|_| Error::InvalidOnchainFsuriFormat)?; |
| .join(&cid) | ||
| .expect("image url"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using .expect() can cause the server to panic if joining the URL fails, for example if the CID is not a valid URL path segment. It's better to handle this potential error gracefully by converting it into a Result.
| .join(&cid) | |
| .expect("image url"); | |
| .join(&cid) | |
| .map_err(|_| Error::InvalidOnchainFsuriFormat)?; |
…piration for typed decoder
f58f5b6 to
f1e138b
Compare
just pushing a new commit to purge useless method of
dob_decode_svgto keep the another one that parsing image from a particular fsuri