Committer: Alka Trivedi <alkatrivedi12dec@gmail.com> On branch benchmarking Changes to be committed: new file: samples/samples/benchmarking.py
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up-to-date status, view the checks section at the bottom of the pull request.
| def query_data(thread_id): | ||
| print("running thread ", thread_id) | ||
| start_time = time.time() | ||
| time.sleep(10) |
Remove this sleep. Instead, increase the number of transactions per run if you want the benchmark to execute over a longer period of time.
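A minimal sketch of what this suggestion could look like: the run is lengthened by repeating the operation rather than by sleeping, and only the operation itself is timed. The helper `run_transactions` and the stand-in `fake_query` are hypothetical names used for illustration; they are not part of the PR or of the Spanner client.

```python
import time


def run_transactions(operation, num_transactions):
    """Call `operation` repeatedly and return one latency (in seconds) per call."""
    latencies = []
    for _ in range(num_transactions):
        start = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - start)
    return latencies


def fake_query():
    # Hypothetical stand-in for the real database call; does a little CPU work.
    sum(range(1000))


# Raising num_transactions extends the run without padding it with sleep().
latencies = run_transactions(fake_query, num_transactions=200)
print(len(latencies), "transactions timed")
```

Because no sleep sits inside the timed region, the recorded latencies reflect only the work being measured.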
| # Create a Spanner database instance | ||
| instance = spanner_client.instance(instance_id) | ||
| pool = pool.FixedSizePool(size = 10, logging_enabled=True) |
- What's the default size of FixedSizePool? Can we avoid passing size = 10 and use the default size? That will take us closer to what a general customer will use.
- @surbhigarg92 Are we using logging_enabled=True as a global option? I thought we had refactored this to be a pool option similar to close_inactive_transactions.
| p90Index = math.floor(0.9*len(latencies)) | ||
| p90Latency = latencies[p90Index] | ||
| return [p50Latency, p90Latency] |
| pool = pool.FixedSizePool(size = 10, logging_enabled=True) | ||
| database = instance.database(pool=pool, database_id=database_id, close_inactive_transactions=True) | ||
| transaction_time = [] |
A single global transaction_time variable mixes up the results of different kinds of transactions. I see you have 3 types here: executeSql, batch transaction and DML (insert). It would be good to profile the latency of each type separately so that we can tell whether any one of them is regressing.
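One way to keep the three kinds of transactions apart is to record latencies under a per-type key and summarize each key on its own. The sketch below is illustrative only: `record_latency`, `percentile`, `summarize` and the key names are hypothetical, and the percentile index follows the floor-based indexing used in the PR, clamped so it cannot run past the end of the list.

```python
import math
import threading
from collections import defaultdict

# One latency list per transaction type, guarded by a lock for use across threads.
_latencies = defaultdict(list)
_lock = threading.Lock()


def record_latency(kind, seconds):
    """Store one latency sample under its transaction type."""
    with _lock:
        _latencies[kind].append(seconds)


def percentile(sorted_values, fraction):
    """Return the value at floor(fraction * n), clamped to the last index."""
    index = min(math.floor(fraction * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[index]


def summarize():
    """Return {kind: (p50, p90)} for every transaction type recorded so far."""
    summary = {}
    with _lock:
        for kind, values in _latencies.items():
            ordered = sorted(values)
            summary[kind] = (percentile(ordered, 0.5), percentile(ordered, 0.9))
    return summary


# Made-up samples purely to show the shape of the output.
for i in range(1, 11):
    record_latency("execute_sql", i / 100)
record_latency("insert_with_dml", 0.5)

for kind, (p50, p90) in summarize().items():
    print(kind, "p50:", p50, "p90:", p90)
```

With this layout, a regression in one transaction type shows up in its own p50/p90 pair instead of being averaged into the others.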
| "SELECT 1 FROM Singers" | ||
| ) | ||
| # for row in results: |
Nitpick: could you remove the commented-out code?
| # [END spanner_query_data] | ||
| # [START spanner_batch_transaction] | ||
| def batch_transaction(thread_id): |
Where is this function batch_transaction invoked? Similarly, where is insert_with_dml invoked?
| # [END insert_with_dml] | ||
| # Define the number of threads | ||
| num_threads = 20 |
- At most 10 threads can be active at a time, since the pool has only 10 sessions. The remaining 10 threads will always be waiting for a session.
- Currently you are running 1 query per thread. It would be good to increase the number of transactions per thread. For example, with 2000 transactions in total, 20 threads works out to 100 transactions per thread. This ensures the run executes for longer and does not finish in < 1 second. It also lets you remove the 10s sleep().
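The two points above can be sketched together. In the snippet below a `BoundedSemaphore` stands in for the 10-session pool (an assumption made for illustration; it is not how the Spanner client implements pooling), and `fake_transaction`, `worker` and the constants are hypothetical names. The total work is split evenly across threads, and the peak number of concurrently active transactions is tracked to show that it never exceeds the pool size.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

TOTAL_TRANSACTIONS = 2000
NUM_THREADS = 20
POOL_SIZE = 10  # assumed number of sessions available

# Stand-in for the session pool: at most POOL_SIZE holders at any moment.
sessions = threading.BoundedSemaphore(POOL_SIZE)
counter_lock = threading.Lock()
active = 0
peak_active = 0


def fake_transaction():
    """Hypothetical transaction that needs a session for a short time."""
    global active, peak_active
    with sessions:
        with counter_lock:
            active += 1
            peak_active = max(peak_active, active)
        time.sleep(0.0005)  # simulated work, not benchmark padding
        with counter_lock:
            active -= 1


def worker(num_transactions):
    """Run this thread's share of transactions and return their latencies."""
    latencies = []
    for _ in range(num_transactions):
        start = time.perf_counter()
        fake_transaction()
        latencies.append(time.perf_counter() - start)
    return latencies


per_thread = TOTAL_TRANSACTIONS // NUM_THREADS
with ThreadPoolExecutor(max_workers=NUM_THREADS) as executor:
    results = list(executor.map(worker, [per_thread] * NUM_THREADS))

all_latencies = [value for chunk in results for value in chunk]
print("transactions per thread:", per_thread)
print("total latencies recorded:", len(all_latencies))
print("peak concurrent transactions:", peak_active)
```

Note that the measured latency here includes time spent waiting for a session, so running more threads than sessions inflates the numbers; matching the thread count to the pool size avoids that.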
Benchmarking long running session removals