1 00:00:01,466 --> 00:00:08,099 Welcome to a new episode of High-Performance Java Persistence video course. 2 00:00:08,100 --> 00:00:14,299 In this lesson, you are going to find out why JDBC logging is very important when using JPA and Hibernate, 3 00:00:14,300 --> 00:00:20,332 what configuration does Hibernate have to offer to customize statement logging, and why a JDBC DataSource 4 00:00:20,333 --> 00:00:26,399 or Driver Proxy is a much more flexible alternative to Hibernate logging. 5 00:00:26,400 --> 00:00:29,765 First of all, if you are working with JPA and Hibernate, 6 00:00:29,766 --> 00:00:37,032 then you need statement logging. Unlike plain JDBC, where you have absolute control over what statement are sent to the database, 7 00:00:37,033 --> 00:00:44,999 JPA and Hibernate generate statements based on various entity state transitions like persist, merge or remove. If you don't use statement logging, 8 00:00:45,000 --> 00:00:52,999 then you will never know what Hibernate executes on your behalf. And that's bad because, if you don't validate the auto-generated statements, 9 00:00:53,000 --> 00:01:01,000 you might bump into various data access performance issues. In this episode, you are going to see how we can automate the statement 10 00:01:01,866 --> 00:01:09,866 validation process so that your initial data access assumptions still hold even if the application code evolves with time. 11 00:01:10,766 --> 00:01:18,665 Hibernate offers three log-based configuration properties: show_sql, format_sql, and use_sql_comments. 12 00:01:18,666 --> 00:01:25,499 The first one, show_sql, is to be avoided. That's because it prints the statements to the Console. 13 00:01:25,500 --> 00:01:32,132 And, for the same reasons you are using a Logging framework like Log4j or Logback instead of printing to System. out, you have to use a 14 00:01:32,133 --> 00:01:36,599 a logging framework for Hibernate-generated SQL statements as well. 15 00:01:36,600 --> 00:01:43,732 The second one, format_sql, can be used to print the SQL statement using a multi-line formatter. 16 00:01:43,733 --> 00:01:50,999 The third one, use_sql_comments, causes Hibernate to annotate SQL statements with various comments 17 00:01:51,000 --> 00:01:55,465 which can help you to understand why a given statement was generated in a certain way. 18 00:01:57,033 --> 00:02:05,033 Behind the scenes, Hibernate uses the JBoss Logging framework which acts as a bridge to the actual logging framework in use by our enterprise 19 00:02:05,800 --> 00:02:13,800 application. So, you can use any logging framework you want like Log4j or Log4j 2, Logback via SLF4J or Java Logging. 20 00:02:14,600 --> 00:02:15,799 Now, 21 00:02:15,800 --> 00:02:21,632 when persisting a Post entity, Hibernate is going to output the following SQL INSERT statement. 22 00:02:21,633 --> 00:02:25,299 By default, statements are logged using placeholders. 23 00:02:25,300 --> 00:02:30,432 Later, we are going to see how to print bind variables as well. 24 00:02:30,433 --> 00:02:35,265 But, prior to doing that, let's see what happens if we activate statement formatting. 25 00:02:35,266 --> 00:02:43,099 The format_sql setting takes effect prior to printing the SQL statement. So, it's only applicable for Hibernate statement logging. It will not work if 26 00:02:43,100 --> 00:02:48,365 you are using an external statement logging framework. 27 00:02:48,366 --> 00:02:54,099 When you enable the format_sql configuration property and persist the 28 00:02:54,100 --> 00:02:59,865 same Post entity, you are going to see the following SQL INSERT statement printed in your log. 29 00:02:59,866 --> 00:03:07,199 When you enable statement comments, Hibernate is going to append various SQL-level comments when generating the statement like 30 00:03:07,200 --> 00:03:12,132 the entity state transition that triggered the current executing statement, 31 00:03:12,133 --> 00:03:16,699 the reason for choosing a join when fetching a given result set, 32 00:03:16,700 --> 00:03:21,265 or the explicit locking mechanism employed by the current statement. 33 00:03:21,266 --> 00:03:25,465 Unlike statement formatting, comments are sent to the JDBC driver. 34 00:03:25,466 --> 00:03:30,265 Because it adds an extra overhead to the networking I/O, you don't want to leave 35 00:03:30,266 --> 00:03:35,099 this configuration enabled in a production environement. 36 00:03:35,100 --> 00:03:38,632 If you want to log the Prepared Statement bind parameter values, 37 00:03:38,633 --> 00:03:44,599 you have to add a new appender for the SQL type descriptor Hibernate package. 38 00:03:44,600 --> 00:03:51,799 Then, Hibernate is going to log both the type and the value used for each particular bind parameter. 39 00:03:52,701 --> 00:03:58,932 Even if Hibernate logging allows you to print both the prepared statements and their runtime bind parameter values, 40 00:03:58,933 --> 00:04:04,365 an external JDBC Statement Proxy is usually preferred. Either the JDBC Driver 41 00:04:04,366 --> 00:04:09,065 or the DataSource can be proxied and, this way, we can intercept any statement execution and 42 00:04:09,066 --> 00:04:13,799 log more than just the statement and its bind parameter values. 43 00:04:13,800 --> 00:04:21,800 Besides custom statement logging, a JDBC Proxy can provide other cross-cutting concerns like detecting long-running statements 44 00:04:21,966 --> 00:04:26,065 or validating the number of statements being executed. 45 00:04:26,066 --> 00:04:34,066 P6Spy is an open-source framework that supports a declarative configuration approach via an external properties file. You can use 46 00:04:34,166 --> 00:04:39,565 P6Spy to proxy either the JDBC Driver, which is very suitable for Java EE 47 00:04:39,566 --> 00:04:41,299 or the JDBC DataSource, 48 00:04:41,300 --> 00:04:45,899 which is usually the case for Spring-based enterprise applications. 49 00:04:45,900 --> 00:04:53,065 P6Spy prints more than just the SQL statement itself. It can capture the current timestamp, 50 00:04:53,066 --> 00:04:55,299 the statement execution time, 51 00:04:55,300 --> 00:04:57,832 the statement type: simple or batched, 52 00:04:57,833 --> 00:05:01,465 the database connection used by the current executing statement, 53 00:05:01,466 --> 00:05:04,599 the original statememnt with placeholders, 54 00:05:04,600 --> 00:05:10,132 the SQL statement where placeholders are substitutied with bind parameter values. 55 00:05:10,133 --> 00:05:18,133 So, if we use P6Spy and persist one Post entity, the statement is going to be logged as follows. 56 00:05:18,366 --> 00:05:23,632 Notice that formatted statement includes the bind parameter values. 57 00:05:23,633 --> 00:05:28,599 Even the commit command is captured and sent to the log. 58 00:05:28,600 --> 00:05:36,600 Now, let's see what P6Spy will log if we enable JDBC batching and persist three Post entities. 59 00:05:37,733 --> 00:05:42,765 P6Spy is going to log four statements. The first three 60 00:05:42,766 --> 00:05:48,332 are associated to the Statement addBatch method call, 61 00:05:48,333 --> 00:05:52,899 hence the value of "batch" in the statement type column. 62 00:05:52,900 --> 00:05:55,199 The 4th logged statement 63 00:05:55,200 --> 00:05:59,299 is triggered by the Statement executeBatch method call, 64 00:05:59,300 --> 00:06:04,465 hence the value of "statement" in the category column. 65 00:06:04,466 --> 00:06:06,432 On the last logged statement, 66 00:06:06,433 --> 00:06:11,165 the execution time is greater than the previous ones. 67 00:06:11,166 --> 00:06:14,165 That's because only after the three statemements 68 00:06:14,166 --> 00:06:21,032 were batched at the JDBC Driver level, Hibernate can execute the batch and send it to the database. 69 00:06:21,934 --> 00:06:27,799 Datasource-proxy is also an open-source project which you can use to proxy the JDBC DataSource. 70 00:06:27,800 --> 00:06:33,165 If you're using Spring, setting up datasource-proxy is just a matter of adding one Java bean 71 00:06:33,166 --> 00:06:38,532 config which decorates a given DataSource with logging capabilities. 72 00:06:38,533 --> 00:06:41,865 For Java EE, it's much more difficult to use datasource-proxy 73 00:06:41,866 --> 00:06:46,332 because the DataSource is provided and managed by the Application Server. 74 00:06:46,333 --> 00:06:48,665 Another benefit of using datasource-proxy 75 00:06:48,666 --> 00:06:55,632 is that you can supply your own custom statement execution listeners, like a statement count assertion mechanism. 76 00:06:55,633 --> 00:06:58,165 So, if we use datasource-proxy 77 00:06:58,166 --> 00:07:03,132 and persist one Post entity, the statement is going to be logged as follows. 78 00:07:03,133 --> 00:07:09,299 The prepared statement is printed along with the current executing bind parameter values. 79 00:07:09,300 --> 00:07:16,765 When rerunning the previous example which was batching three Post entities, datasource-proxy will log the following message. 80 00:07:16,766 --> 00:07:23,132 We can see that the statement was batched and it took three sets of bind parameter values. 81 00:07:23,133 --> 00:07:29,232 And, just like P6Spy, datasource-proxy prints the statement execution time as well. 82 00:07:30,666 --> 00:07:32,365 Testing time. 83 00:07:32,366 --> 00:07:40,232 In my high-performance-java-persistence GitHub repository, you can find the P6SpyTest. Configuring P6Spy is very easy. 84 00:07:40,233 --> 00:07:45,499 All you need to is to wrap the actual DataSource with the P6DataSource decorator, 85 00:07:45,500 --> 00:07:49,865 which we can see here. 86 00:07:49,866 --> 00:07:57,099 Behind the scenes, P6Spy relies on the spy. properties configuration file which is located in our resources folder. 87 00:07:57,100 --> 00:08:00,165 And looks as follows. 88 00:08:00,166 --> 00:08:04,699 Using the spy. properties file, you can customize the way P6Spy works 89 00:08:04,700 --> 00:08:07,365 like the logging framework of choice. 90 00:08:07,366 --> 00:08:13,099 Now, if we go back to our test and we are running the test batch 91 00:08:13,100 --> 00:08:14,765 method 92 00:08:14,766 --> 00:08:22,766 We can see that P6Spy is going to log four statements. 93 00:08:23,933 --> 00:08:28,299 The first three were trigerred by the addBatch method execution. 94 00:08:28,300 --> 00:08:32,665 The 4th one was written when the executeBatch method was called. 95 00:08:32,666 --> 00:08:35,899 And that's it. It's actually very easy to use P6Spy. 96 00:08:37,200 --> 00:08:44,165 To see how datasource-proxy works, just open the DataSourceProxyTest in my GitHub repository. 97 00:08:44,166 --> 00:08:46,965 This time the configuration is done programmatically 98 00:08:46,966 --> 00:08:49,865 as you can see here. The actual DataSource 99 00:08:49,866 --> 00:08:52,765 is wrapped by the datasource-proxy 100 00:08:52,766 --> 00:08:57,999 and we are using SLF4J for logging the statements. 101 00:08:58,000 --> 00:09:00,965 Now, if we want to run 102 00:09:00,966 --> 00:09:06,132 the testBatch which inserts three Post entities 103 00:09:06,133 --> 00:09:13,199 we can see that datasource-proxy is going to log a single statement. 104 00:09:13,200 --> 00:09:18,199 Here. 105 00:09:18,200 --> 00:09:20,899 There are three sets of parameters 106 00:09:20,900 --> 00:09:23,432 that are being logged, and that's it! 107 00:09:23,433 --> 00:09:26,899 Setting up datasource-proxy is actually very easy. 108 00:09:27,701 --> 00:09:34,265 Previously, I said that using a DataSource Proxy is much more flexible than the default Hibernate logging mechanism. 109 00:09:34,266 --> 00:09:37,065 This is because, when using a DataSource Proxy, 110 00:09:37,066 --> 00:09:45,066 we can easily supply our own statement-based listeners and validate the number of statements being executed. Let's say we have two entities. 111 00:09:45,266 --> 00:09:47,899 A parent Post entity 112 00:09:47,900 --> 00:09:50,265 and a child PostComment 113 00:09:50,266 --> 00:09:55,065 which has a @ManyToOne association to the Post entity. 114 00:09:55,600 --> 00:10:00,265 Now, in our init method, we are going to add two Post 115 00:10:00,266 --> 00:10:02,765 and two PostComment entities, 116 00:10:02,766 --> 00:10:10,032 and in our test method, we are going to run a single JPQL query which is going to select all the PostComment entities 117 00:10:10,033 --> 00:10:14,032 and we can expect a single SQL statement 118 00:10:14,033 --> 00:10:19,265 to be executed. 119 00:10:19,266 --> 00:10:25,899 And that's exactly what we get. 120 00:10:26,534 --> 00:10:30,499 However, if some other developer comes in and tries to change 121 00:10:30,500 --> 00:10:33,032 the @ManyToOne fetch startegy 122 00:10:33,033 --> 00:10:41,033 from LAZY to EAGER 123 00:10:42,300 --> 00:10:48,532 and rerun the test 124 00:10:48,533 --> 00:10:51,599 we can see now that the test is going to fail, 125 00:10:51,600 --> 00:10:53,399 and instead of one statement 126 00:10:53,400 --> 00:11:01,400 three statements are going to be executed. The two extra secondary queries are executed because the "post" association needs to be initialized 127 00:11:02,566 --> 00:11:10,365 eagerly and we can see that in the log. 128 00:11:10,366 --> 00:11:11,999 Here are the two extra 129 00:11:12,000 --> 00:11:15,065 secondary statements that were executed. 130 00:11:15,933 --> 00:11:23,165 Asserting the number of statements is extremely useful and advisable to do on every project that uses JPA and Hibernate. 131 00:11:23,166 --> 00:11:26,065 Just like you assert the number of comments being extracted, 132 00:11:26,066 --> 00:11:29,699 you should assert the number of statements being executed as well. 133 00:11:29,700 --> 00:11:34,465 This way, you can make sure that the code you are testing now is going to work the same when you 134 00:11:34,466 --> 00:11:39,232 deploy it into production. And if someone else comes in and changes the fetch strategy 135 00:11:39,233 --> 00:11:46,765 of your entities, you're going to be notified by the statement count validator because tests are going to fail. This works as follows. 136 00:11:57,166 --> 00:12:05,166 The SQL statement count validator relies on datasource-proxy QueryCountHolder which counts all the statements that were intercepted. 137 00:12:06,533 --> 00:12:14,533 The reset method clears the SELECT, INSERT, UPDATE, and DELETE counters. For every SQL statement type, there is an associated assert count method. 138 00:12:19,866 --> 00:12:24,765 If they expected statement count is different than the one being recorded by the datasource-proxy, 139 00:12:24,766 --> 00:12:28,632 a Java Exception is going to be thrown, and the test will fail. 140 00:12:28,633 --> 00:12:33,632 After executing a certain data access logic, you can validate the number of statement you expect to 141 00:12:33,633 --> 00:12:38,665 have been executed since the last time you called reset. 142 00:12:38,666 --> 00:12:40,632 So, here you call reset, 143 00:12:40,633 --> 00:12:45,532 and here you assert the number of statements that must be executed. 144 00:12:45,533 --> 00:12:48,533 It's easy, yet very powerful.