1
00:00:01,466 --> 00:00:08,099
Welcome to a new episode of High-Performance
Java Persistence video course.

2
00:00:08,100 --> 00:00:14,299
In this lesson, you are going to find out why JDBC
logging is very important when using JPA and Hibernate,

3
00:00:14,300 --> 00:00:20,332
what configuration does Hibernate have to offer to
customize statement logging, and why a JDBC DataSource

4
00:00:20,333 --> 00:00:26,399
or Driver Proxy is a much more flexible
alternative to Hibernate logging.

5
00:00:26,400 --> 00:00:29,765
First of all, if you are working
with JPA and Hibernate,

6
00:00:29,766 --> 00:00:37,032
then you need statement logging. Unlike plain JDBC, where you have
absolute control over what statement are sent to the database,

7
00:00:37,033 --> 00:00:44,999
JPA and Hibernate generate statements based on various entity state transitions
like persist, merge or remove. If you don't use statement logging,

8
00:00:45,000 --> 00:00:52,999
then you will never know what Hibernate executes on your behalf. And
that's bad because, if you don't validate the auto-generated statements,

9
00:00:53,000 --> 00:01:01,000
you might bump into various data access performance issues. In this
episode, you are going to see how we can automate the statement

10
00:01:01,866 --> 00:01:09,866
validation process so that your initial data access assumptions
still hold even if the application code evolves with time.

11
00:01:10,766 --> 00:01:18,665
Hibernate offers three log-based configuration
properties: show_sql, format_sql, and use_sql_comments.

12
00:01:18,666 --> 00:01:25,499
The first one, show_sql, is to be avoided. That's
because it prints the statements to the Console.

13
00:01:25,500 --> 00:01:32,132
And, for the same reasons you are using a Logging framework like Log4j
or Logback instead of printing to System. out, you have to use a

14
00:01:32,133 --> 00:01:36,599
a logging framework for Hibernate-generated
SQL statements as well.

15
00:01:36,600 --> 00:01:43,732
The second one, format_sql, can be used to print
the SQL statement using a multi-line formatter.

16
00:01:43,733 --> 00:01:50,999
The third one, use_sql_comments, causes Hibernate
to annotate SQL statements with various comments

17
00:01:51,000 --> 00:01:55,465
which can help you to understand why a given
statement was generated in a certain way.

18
00:01:57,033 --> 00:02:05,033
Behind the scenes, Hibernate uses the JBoss Logging framework which acts
as a bridge to the actual logging framework in use by our enterprise

19
00:02:05,800 --> 00:02:13,800
application. So, you can use any logging framework you want
like Log4j or Log4j 2, Logback via SLF4J or Java Logging.

20
00:02:14,600 --> 00:02:15,799
Now,

21
00:02:15,800 --> 00:02:21,632
when persisting a Post entity, Hibernate is going
to output the following SQL INSERT statement.

22
00:02:21,633 --> 00:02:25,299
By default, statements are
logged using placeholders.

23
00:02:25,300 --> 00:02:30,432
Later, we are going to see how
to print bind variables as well.

24
00:02:30,433 --> 00:02:35,265
But, prior to doing that, let's see what
happens if we activate statement formatting.

25
00:02:35,266 --> 00:02:43,099
The format_sql setting takes effect prior to printing the SQL statement. So,
it's only applicable for Hibernate statement logging. It will not work if

26
00:02:43,100 --> 00:02:48,365
you are using an external
statement logging framework.

27
00:02:48,366 --> 00:02:54,099
When you enable the format_sql
configuration property and persist the

28
00:02:54,100 --> 00:02:59,865
same Post entity, you are going to see the following
SQL INSERT statement printed in your log.

29
00:02:59,866 --> 00:03:07,199
When you enable statement comments, Hibernate is going to append
various SQL-level comments when generating the statement like

30
00:03:07,200 --> 00:03:12,132
the entity state transition that triggered
the current executing statement,

31
00:03:12,133 --> 00:03:16,699
the reason for choosing a join
when fetching a given result set,

32
00:03:16,700 --> 00:03:21,265
or the explicit locking mechanism
employed by the current statement.

33
00:03:21,266 --> 00:03:25,465
Unlike statement formatting, comments
are sent to the JDBC driver.

34
00:03:25,466 --> 00:03:30,265
Because it adds an extra overhead to the
networking I/O, you don't want to leave

35
00:03:30,266 --> 00:03:35,099
this configuration enabled in
a production environement.

36
00:03:35,100 --> 00:03:38,632
If you want to log the Prepared
Statement bind parameter values,

37
00:03:38,633 --> 00:03:44,599
you have to add a new appender for the
SQL type descriptor Hibernate package.

38
00:03:44,600 --> 00:03:51,799
Then, Hibernate is going to log both the type and
the value used for each particular bind parameter.

39
00:03:52,701 --> 00:03:58,932
Even if Hibernate logging allows you to print both the
prepared statements and their runtime bind parameter values,

40
00:03:58,933 --> 00:04:04,365
an external JDBC Statement Proxy is
usually preferred. Either the JDBC Driver

41
00:04:04,366 --> 00:04:09,065
or the DataSource can be proxied and, this way,
we can intercept any statement execution and

42
00:04:09,066 --> 00:04:13,799
log more than just the statement
and its bind parameter values.

43
00:04:13,800 --> 00:04:21,800
Besides custom statement logging, a JDBC Proxy can provide other
cross-cutting concerns like detecting long-running statements

44
00:04:21,966 --> 00:04:26,065
or validating the number of
statements being executed.

45
00:04:26,066 --> 00:04:34,066
P6Spy is an open-source framework that supports a declarative
configuration approach via an external properties file. You can use

46
00:04:34,166 --> 00:04:39,565
P6Spy to proxy either the JDBC Driver,
which is very suitable for Java EE

47
00:04:39,566 --> 00:04:41,299
or the JDBC DataSource,

48
00:04:41,300 --> 00:04:45,899
which is usually the case for
Spring-based enterprise applications.

49
00:04:45,900 --> 00:04:53,065
P6Spy prints more than just the SQL statement
itself. It can capture the current timestamp,

50
00:04:53,066 --> 00:04:55,299
the statement execution time,

51
00:04:55,300 --> 00:04:57,832
the statement type: simple or batched,

52
00:04:57,833 --> 00:05:01,465
the database connection used by
the current executing statement,

53
00:05:01,466 --> 00:05:04,599
the original statememnt with placeholders,

54
00:05:04,600 --> 00:05:10,132
the SQL statement where placeholders are
substitutied with bind parameter values.

55
00:05:10,133 --> 00:05:18,133
So, if we use P6Spy and persist one Post entity,
the statement is going to be logged as follows.

56
00:05:18,366 --> 00:05:23,632
Notice that formatted statement
includes the bind parameter values.

57
00:05:23,633 --> 00:05:28,599
Even the commit command is
captured and sent to the log.

58
00:05:28,600 --> 00:05:36,600
Now, let's see what P6Spy will log if we enable
JDBC batching and persist three Post entities.

59
00:05:37,733 --> 00:05:42,765
P6Spy is going to log four statements.
The first three

60
00:05:42,766 --> 00:05:48,332
are associated to the Statement
addBatch method call,

61
00:05:48,333 --> 00:05:52,899
hence the value of "batch" in
the statement type column.

62
00:05:52,900 --> 00:05:55,199
The 4th logged statement

63
00:05:55,200 --> 00:05:59,299
is triggered by the Statement
executeBatch method call,

64
00:05:59,300 --> 00:06:04,465
hence the value of "statement"
in the category column.

65
00:06:04,466 --> 00:06:06,432
On the last logged statement,

66
00:06:06,433 --> 00:06:11,165
the execution time is greater
than the previous ones.

67
00:06:11,166 --> 00:06:14,165
That's because only after
the three statemements

68
00:06:14,166 --> 00:06:21,032
were batched at the JDBC Driver level, Hibernate
can execute the batch and send it to the database.

69
00:06:21,934 --> 00:06:27,799
Datasource-proxy is also an open-source project
which you can use to proxy the JDBC DataSource.

70
00:06:27,800 --> 00:06:33,165
If you're using Spring, setting up datasource-proxy
is just a matter of adding one Java bean

71
00:06:33,166 --> 00:06:38,532
config which decorates a given
DataSource with logging capabilities.

72
00:06:38,533 --> 00:06:41,865
For Java EE, it's much more
difficult to use datasource-proxy

73
00:06:41,866 --> 00:06:46,332
because the DataSource is provided and
managed by the Application Server.

74
00:06:46,333 --> 00:06:48,665
Another benefit of using datasource-proxy

75
00:06:48,666 --> 00:06:55,632
is that you can supply your own custom statement execution
listeners, like a statement count assertion mechanism.

76
00:06:55,633 --> 00:06:58,165
So, if we use datasource-proxy

77
00:06:58,166 --> 00:07:03,132
and persist one Post entity, the statement
is going to be logged as follows.

78
00:07:03,133 --> 00:07:09,299
The prepared statement is printed along with
the current executing bind parameter values.

79
00:07:09,300 --> 00:07:16,765
When rerunning the previous example which was batching three
Post entities, datasource-proxy will log the following message.

80
00:07:16,766 --> 00:07:23,132
We can see that the statement was batched and
it took three sets of bind parameter values.

81
00:07:23,133 --> 00:07:29,232
And, just like P6Spy, datasource-proxy
prints the statement execution time as well.

82
00:07:30,666 --> 00:07:32,365
Testing time.

83
00:07:32,366 --> 00:07:40,232
In my high-performance-java-persistence GitHub repository,
you can find the P6SpyTest. Configuring P6Spy is very easy.

84
00:07:40,233 --> 00:07:45,499
All you need to is to wrap the actual
DataSource with the P6DataSource decorator,

85
00:07:45,500 --> 00:07:49,865
which we can see here.

86
00:07:49,866 --> 00:07:57,099
Behind the scenes, P6Spy relies on the spy. properties
configuration file which is located in our resources folder.

87
00:07:57,100 --> 00:08:00,165
And looks as follows.

88
00:08:00,166 --> 00:08:04,699
Using the spy. properties file, you
can customize the way P6Spy works

89
00:08:04,700 --> 00:08:07,365
like the logging framework of choice.

90
00:08:07,366 --> 00:08:13,099
Now, if we go back to our test and
we are running the test batch

91
00:08:13,100 --> 00:08:14,765
method

92
00:08:14,766 --> 00:08:22,766
We can see that P6Spy is going
to log four statements.

93
00:08:23,933 --> 00:08:28,299
The first three were trigerred by
the addBatch method execution.

94
00:08:28,300 --> 00:08:32,665
The 4th one was written when the
executeBatch method was called.

95
00:08:32,666 --> 00:08:35,899
And that's it. It's actually
very easy to use P6Spy.

96
00:08:37,200 --> 00:08:44,165
To see how datasource-proxy works, just open the
DataSourceProxyTest in my GitHub repository.

97
00:08:44,166 --> 00:08:46,965
This time the configuration
is done programmatically

98
00:08:46,966 --> 00:08:49,865
as you can see here. The actual DataSource

99
00:08:49,866 --> 00:08:52,765
is wrapped by the datasource-proxy

100
00:08:52,766 --> 00:08:57,999
and we are using SLF4J for
logging the statements.

101
00:08:58,000 --> 00:09:00,965
Now, if we want to run

102
00:09:00,966 --> 00:09:06,132
the testBatch which inserts
three Post entities

103
00:09:06,133 --> 00:09:13,199
we can see that datasource-proxy is
going to log a single statement.

104
00:09:13,200 --> 00:09:18,199
Here.

105
00:09:18,200 --> 00:09:20,899
There are three sets of parameters

106
00:09:20,900 --> 00:09:23,432
that are being logged, and that's it!

107
00:09:23,433 --> 00:09:26,899
Setting up datasource-proxy
is actually very easy.

108
00:09:27,701 --> 00:09:34,265
Previously, I said that using a DataSource Proxy is much
more flexible than the default Hibernate logging mechanism.

109
00:09:34,266 --> 00:09:37,065
This is because, when
using a DataSource Proxy,

110
00:09:37,066 --> 00:09:45,066
we can easily supply our own statement-based listeners and validate the
number of statements being executed. Let's say we have two entities.

111
00:09:45,266 --> 00:09:47,899
A parent Post entity

112
00:09:47,900 --> 00:09:50,265
and a child PostComment

113
00:09:50,266 --> 00:09:55,065
which has a @ManyToOne
association to the Post entity.

114
00:09:55,600 --> 00:10:00,265
Now, in our init method, we
are going to add two Post

115
00:10:00,266 --> 00:10:02,765
and two PostComment entities,

116
00:10:02,766 --> 00:10:10,032
and in our test method, we are going to run a single JPQL
query which is going to select all the PostComment entities

117
00:10:10,033 --> 00:10:14,032
and we can expect a single SQL statement

118
00:10:14,033 --> 00:10:19,265
to be executed.

119
00:10:19,266 --> 00:10:25,899
And that's exactly what we get.

120
00:10:26,534 --> 00:10:30,499
However, if some other developer
comes in and tries to change

121
00:10:30,500 --> 00:10:33,032
the @ManyToOne fetch startegy

122
00:10:33,033 --> 00:10:41,033
from LAZY to EAGER

123
00:10:42,300 --> 00:10:48,532
and rerun the test

124
00:10:48,533 --> 00:10:51,599
we can see now that the
test is going to fail,

125
00:10:51,600 --> 00:10:53,399
and instead of one statement

126
00:10:53,400 --> 00:11:01,400
three statements are going to be executed. The two extra secondary queries
are executed because the "post" association needs to be initialized

127
00:11:02,566 --> 00:11:10,365
eagerly and we can see that in the log.

128
00:11:10,366 --> 00:11:11,999
Here are the two extra

129
00:11:12,000 --> 00:11:15,065
secondary statements that were executed.

130
00:11:15,933 --> 00:11:23,165
Asserting the number of statements is extremely useful and
advisable to do on every project that uses JPA and Hibernate.

131
00:11:23,166 --> 00:11:26,065
Just like you assert the number
of comments being extracted,

132
00:11:26,066 --> 00:11:29,699
you should assert the number of
statements being executed as well.

133
00:11:29,700 --> 00:11:34,465
This way, you can make sure that the code you are
testing now is going to work the same when you

134
00:11:34,466 --> 00:11:39,232
deploy it into production. And if someone
else comes in and changes the fetch strategy

135
00:11:39,233 --> 00:11:46,765
of your entities, you're going to be notified by the statement count
validator because tests are going to fail. This works as follows.

136
00:11:57,166 --> 00:12:05,166
The SQL statement count validator relies on datasource-proxy
QueryCountHolder which counts all the statements that were intercepted.

137
00:12:06,533 --> 00:12:14,533
The reset method clears the SELECT, INSERT, UPDATE, and DELETE counters.
For every SQL statement type, there is an associated assert count method.

138
00:12:19,866 --> 00:12:24,765
If they expected statement count is different than
the one being recorded by the datasource-proxy,

139
00:12:24,766 --> 00:12:28,632
a Java Exception is going to be
thrown, and the test will fail.

140
00:12:28,633 --> 00:12:33,632
After executing a certain data access logic, you
can validate the number of statement you expect to

141
00:12:33,633 --> 00:12:38,665
have been executed since the
last time you called reset.

142
00:12:38,666 --> 00:12:40,632
So, here you call reset,

143
00:12:40,633 --> 00:12:45,532
and here you assert the number of
statements that must be executed.

144
00:12:45,533 --> 00:12:48,533
It's easy, yet very powerful.