@yanxinyi

…source v2

@yanxinyi
Author

yanxinyi commented Apr 30, 2019

Used setCacheBlocks to disable the block cache:
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCacheBlocks-boolean-
The current implementation deprecates the PhoenixRDD.scala file, so I modified PhoenixDataSource.java to add an option instead.

this.tableName = options.tableName().get();
this.zkUrl = options.get(PhoenixDataSource.ZOOKEEPER_URL).get();
this.dateAsTimestamp = options.getBoolean("dateAsTimestamp", false);
this.disableBlockCache = options.getBoolean("NO_CACHE", false);
Contributor

Can you use Hint.NO_CACHE instead of the String literal, for consistency?
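
A minimal sketch of that suggestion, assuming Hint refers to Phoenix's HintNode.Hint enum. The enum is replaced by a one-constant stand-in so the snippet compiles without Phoenix on the classpath, and the helper name isBlockCacheDisabled and the plain Map of options are hypothetical, not part of this PR:

```java
import java.util.HashMap;
import java.util.Map;

class NoCacheOptionSketch {
    // Stand-in for Phoenix's Hint enum (assumed to be HintNode.Hint);
    // only the constant discussed in this review is modeled.
    enum Hint { NO_CACHE }

    // Hypothetical helper: reads the flag from string-valued options,
    // keyed by the enum constant's name rather than a repeated "NO_CACHE" literal.
    static boolean isBlockCacheDisabled(Map<String, String> options) {
        return Boolean.parseBoolean(options.getOrDefault(Hint.NO_CACHE.name(), "false"));
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put(Hint.NO_CACHE.name(), "true");
        System.out.println(isBlockCacheDisabled(options));
    }
}
```

Keying on Hint.NO_CACHE.name() keeps the option name in one place, so the data source and its tests cannot drift apart on the spelling.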

final QueryPlan queryPlan = pstmt.optimizeQuery(selectStatement);
final Scan scan = queryPlan.getContext().getScan();
if (this.disableBlockCache) {
scan.setCacheBlocks(false);
Contributor

@ChinmaySKulkarni ChinmaySKulkarni Apr 30, 2019

The scan variable is unused; you can remove it. This should be set on each scan in the queryPlan, otherwise the scans run on the Spark executors will not have this hint set. Instead of iterating over each scan here, it may be easier to set this in PhoenixDataSourceReadOptions. We create an instance of it when PhoenixDataSourceReader#planInputPartitions() is called on the driver, and it is embedded in each of our InputPartitions, so the read options are available on the Spark executors (see PhoenixInputPartitionReader#initialize()). That method already iterates over the scans, so you can use the value from the read options to call setCacheBlocks(false) on each one.

Also, when this hint is provided, make sure any other scan objects used on the driver have this property set as well, for example the scan that we use on the driver side to get the region locations.
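
Roughly, the idea is to carry the flag in the read options and apply it to every scan, not only the one taken from the query context. The sketch below uses stand-in types so it compiles without HBase or Phoenix: Scan models only setCacheBlocks/getCacheBlocks from the HBase client, while ReadOptions and applyBlockCacheOption are hypothetical names and not the actual PhoenixDataSourceReadOptions API:

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class ScanCacheSketch {
    // Minimal stand-in for org.apache.hadoop.hbase.client.Scan;
    // only the block cache flag is modeled.
    static class Scan {
        private boolean cacheBlocks = true; // HBase scans cache blocks by default

        Scan setCacheBlocks(boolean cacheBlocks) {
            this.cacheBlocks = cacheBlocks;
            return this;
        }

        boolean getCacheBlocks() {
            return cacheBlocks;
        }
    }

    // Hypothetical stand-in for the read options that travel with each input partition.
    static class ReadOptions implements Serializable {
        private final boolean disableBlockCache;

        ReadOptions(boolean disableBlockCache) {
            this.disableBlockCache = disableBlockCache;
        }

        boolean isBlockCacheDisabled() {
            return disableBlockCache;
        }
    }

    // Applies the option to every scan; the same helper could be called for
    // driver-side scans and for the scans iterated on the executors.
    static void applyBlockCacheOption(List<Scan> scans, ReadOptions options) {
        if (!options.isBlockCacheDisabled()) {
            return; // leave the default caching behavior untouched
        }
        for (Scan scan : scans) {
            scan.setCacheBlocks(false);
        }
    }

    public static void main(String[] args) {
        List<Scan> scans = new ArrayList<>();
        scans.add(new Scan());
        scans.add(new Scan());
        applyBlockCacheOption(scans, new ReadOptions(true));
        for (Scan scan : scans) {
            System.out.println(scan.getCacheBlocks());
        }
    }
}
```

A test that exercises the change would assert on getCacheBlocks() for each scan after the option is applied, rather than on a property round trip.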

private static final String V1 = "v1";
private static final String V2 = "v2";
private static final String V3 = "v3";
private static final String NO_CACHE = "NO_CACHE";
Contributor

Same comment as above: use the already defined enum value Hint.NO_CACHE.

assertEquals(V2, p.getProperty(P2));
assertEquals(V3, p.getProperty(P3));
assertEquals(V3, p.getProperty(P3));
assertEquals(true, Boolean.valueOf(p.getProperty(NO_CACHE)));
Contributor

@ChinmaySKulkarni ChinmaySKulkarni Apr 30, 2019

This test isn't testing your change at all. Here you are using extraOptions to set the property and then just checking that the property is set. Ideally, extraOptions should be used for HBase/Phoenix properties that are valid configs, the kind we would set in, say, hbase-site.xml. NO_CACHE is not such a config, so we are setting it in a different way.
