mirror of https://github.com/MatomoCamp/recording-subtitles.git synced 2024-09-19 16:03:52 +02:00
recording-subtitles/2021/Break Matomo/output.srt
Lukas Winkler 9f4c32d3a4
add a lot more subtitles
Thanks to OpenAI whisper and @simon-ast
2022-10-26 18:15:02 +02:00

1
00:00:00,000 --> 00:00:10,240
Okay, and with this, I would like to start the next talk.
2
00:00:10,240 --> 00:00:14,800
It is once again by Thomas Persson, like I said before, but just for the recordings.
3
00:00:14,800 --> 00:00:19,560
He's a business developer at Digitalist Sweden, and he has worked with tracking digital analytics
4
00:00:19,560 --> 00:00:26,560
since 2010, and has been a contributor to open source since 2007.
5
00:00:26,560 --> 00:00:31,200
And in this talk, how to break Matomo and also some inputs on how to fix it, Thomas
6
00:00:31,200 --> 00:00:36,320
will share his experience with us with running several large Matomo instances.
7
00:00:36,320 --> 00:00:41,400
As most people know, Matomo can be pretty slow when the data grows, and Thomas will
8
00:00:41,400 --> 00:00:46,160
share some of the biggest mistakes and also talk about how they manage them.
9
00:00:46,160 --> 00:00:47,160
And thank you.
10
00:00:47,160 --> 00:00:48,160
Please take it away.
11
00:00:48,160 --> 00:00:49,160
Thank you.
12
00:00:49,160 --> 00:00:56,200
So, yeah, sharing pains is usually good learning.
13
00:00:56,200 --> 00:01:04,280
And I hope you guys can ask a lot of questions, and, yeah, we will hopefully get some interaction
14
00:01:04,280 --> 00:01:06,040
going on as well.
15
00:01:06,040 --> 00:01:12,640
So this is actually based on my learnings, or our learnings, I should say, because it's
16
00:01:12,640 --> 00:01:17,640
not only me at Digitalist working with Matomo, of course, but some things we learned during
17
00:01:17,640 --> 00:01:29,120
the last four years when we focused pretty intensely on maintaining Matomo for a lot of
18
00:01:29,120 --> 00:01:30,120
clients.
19
00:01:30,120 --> 00:01:36,320
So let's start off by doing the short intro.
20
00:01:36,320 --> 00:01:43,920
I'm a generalist, but in this case, I kind of had a background in the 90s where I actually
21
00:01:43,920 --> 00:01:51,160
worked with hosting and maintaining the servers that were really physical and heavy and large
22
00:01:51,160 --> 00:01:52,160
at that time.
23
00:01:52,160 --> 00:01:59,120
So that knowledge I've kind of brought with me during the years and actually got to use
24
00:01:59,120 --> 00:02:07,120
pretty intensively while learning and managing Matomo for our large clients.
25
00:02:07,120 --> 00:02:14,280
Did you listen to my colleague Jorge yesterday, who talked about how we technically maintain
26
00:02:14,280 --> 00:02:20,600
very large installations of Matomo just to get an idea of the size?
27
00:02:20,600 --> 00:02:30,840
I think we have databases that go a bit above a terabyte of data, and we collect
28
00:02:30,840 --> 00:02:37,960
up to between 50 and 100 million actions per month stored in these databases.
29
00:02:37,960 --> 00:02:44,880
So they're quite big, and we maintain them in a setup using Kubernetes where we can scale
30
00:02:44,880 --> 00:02:47,040
things up and down.
31
00:02:47,040 --> 00:02:52,480
Obviously not the most standard setup, but I will not talk so much about Kubernetes today,
32
00:02:52,480 --> 00:02:59,760
more about the practical issues and the configurations that we can share with you guys.
33
00:02:59,760 --> 00:03:07,240
So this is a scenario that perhaps many of you have seen.
34
00:03:07,240 --> 00:03:14,200
We get different types of error messages or reports that are slow in Matomo, and obviously
35
00:03:14,200 --> 00:03:19,480
our users can be pretty sad when that happens.
36
00:03:19,480 --> 00:03:26,720
If you've been working with Matomo, you probably have these experiences over time.
37
00:03:26,720 --> 00:03:34,980
So my pro tips, if you want to have this scenario: I have three main guidelines
38
00:03:34,980 --> 00:03:38,320
that you can follow if you want to have these problems.
39
00:03:38,320 --> 00:03:44,040
The first pro tip is to give your users in Matomo a lot of permissions from the start without
40
00:03:44,040 --> 00:03:46,240
training them.
41
00:03:46,240 --> 00:03:51,740
This I can guarantee you, you will run into a lot of nice issues and problems.
42
00:03:51,740 --> 00:03:58,320
My second pro tip is to write bad applications and start tracking them in Matomo, and then
43
00:03:58,320 --> 00:04:03,680
of course blame Matomo for the problems that you end up having.
44
00:04:03,680 --> 00:04:07,080
We'll look a bit more into this later on.
45
00:04:07,080 --> 00:04:13,640
And my third pro tip is to let your IT department host Matomo without proper Matomo
46
00:04:13,640 --> 00:04:14,640
training.
47
00:04:14,640 --> 00:04:20,520
We'll also discuss a little bit in detail what this might lead to.
48
00:04:20,520 --> 00:04:28,400
Of course, I'm a bit in a joking mode here, but I'll try to give you some examples of
49
00:04:28,400 --> 00:04:31,400
why I give you these pro tips.
50
00:04:31,400 --> 00:04:39,080
So the first one, about giving Matomo users too many permissions: Matomo is a quite complex
51
00:04:39,080 --> 00:04:40,080
tool.
52
00:04:40,080 --> 00:04:47,760
Of course, setting up tracking is something that requires quite a lot of skills in web
53
00:04:47,760 --> 00:04:54,960
development and also JavaScript, and you also need to know how the product works to do this
54
00:04:54,960 --> 00:04:55,960
properly.
55
00:04:55,960 --> 00:05:06,080
Otherwise, you will have bad data in your database and the reports will be hard to understand.
56
00:05:06,080 --> 00:05:13,960
Also, when creating segments or custom reports, you have to know what you're
57
00:05:13,960 --> 00:05:18,360
doing, so to speak, because they can actually kill your performance.
58
00:05:18,360 --> 00:05:21,280
I will give you examples of this later on.
59
00:05:21,280 --> 00:05:27,200
But also doing this wrong might lead to really, really bad user experiences because you need
60
00:05:27,200 --> 00:05:32,520
to know what you are doing and what you're showing to your users.
61
00:05:32,520 --> 00:05:38,180
So you should be really careful about giving away permissions and make sure that users
62
00:05:38,180 --> 00:05:42,480
that can create segments, for instance, know what they're doing.
63
00:05:42,480 --> 00:05:48,880
The last one, or the second one, write bad applications and blame Matomo for the issues.
64
00:05:48,880 --> 00:05:56,760
So this is actually something you need to be aware of that Matomo is just a stupid system.
65
00:05:56,760 --> 00:06:01,000
Whatever you send to Matomo will be eaten by this system.
66
00:06:01,000 --> 00:06:07,320
So you can't really fix things that are broken in your application, or actually you can, and I
67
00:06:07,320 --> 00:06:13,720
will give you some examples of that, but you need to have the mentality of understanding
68
00:06:13,720 --> 00:06:16,760
that Matomo is just a stupid system.
69
00:06:16,760 --> 00:06:22,680
So if you send bad data into Matomo, this could be personal data or whatever, Matomo
70
00:06:22,680 --> 00:06:28,560
will just eat that and you will store it and you will have problems in different ways.
71
00:06:28,560 --> 00:06:35,920
The two most important things to manage properly are URLs and titles because these two will
72
00:06:35,920 --> 00:06:47,720
always be just stored in Matomo when we send it.
73
00:06:47,720 --> 00:06:56,960
And URLs are actually pretty important to manage in a large installation in particular.
74
00:06:56,960 --> 00:07:02,800
So the third guideline, then, let your IT department host Matomo without proper training.
75
00:07:02,800 --> 00:07:11,480
So this goes for many systems, not just Matomo, but why it fails.
76
00:07:11,480 --> 00:07:15,960
And I actually have lots of experiences with it.
77
00:07:15,960 --> 00:07:23,520
We help a lot of Matomo clients and they often approach us because they want help with implementing
78
00:07:23,520 --> 00:07:26,380
tracking or understanding the application.
79
00:07:26,380 --> 00:07:31,880
But usually I spend about a week with the IT department in the beginning just to fix
80
00:07:31,880 --> 00:07:37,920
things that haven't been properly set up or taken care of.
81
00:07:37,920 --> 00:07:44,800
Because maintaining Matomo or any application actually requires you to know the application.
82
00:07:44,800 --> 00:07:50,200
It's not enough to just set up a server and install the software and then you're ready.
83
00:07:50,200 --> 00:07:57,240
And Matomo is really, really a tool that you need to know to maintain properly because
84
00:07:57,240 --> 00:08:01,720
it needs constant updates and testing comes with that.
85
00:08:01,720 --> 00:08:07,880
When we have a new release of Matomo, you really need to test this and know and understand
86
00:08:07,880 --> 00:08:14,000
what was actually installed and how will that affect our instance of Matomo.
87
00:08:14,000 --> 00:08:20,920
Even though we do this 24-7, we work with Matomo and learn, we still have problems when
88
00:08:20,920 --> 00:08:27,720
there are new updates because you don't always know exactly what will happen in your Matomo
89
00:08:27,720 --> 00:08:28,720
environment.
90
00:08:28,720 --> 00:08:34,640
And this is, of course, since it's a complex environment, you have plugins, you have different
91
00:08:34,640 --> 00:08:40,080
integrations and so on and every setup is unique.
92
00:08:40,080 --> 00:08:45,560
We need to have constant monitoring in place and monitoring in this case is not only about
93
00:08:45,560 --> 00:08:55,720
does Matomo respond, it can be monitoring individual reports or monitoring your database
94
00:08:55,720 --> 00:09:00,000
or monitoring for errors really connected to Matomo.
95
00:09:00,000 --> 00:09:03,120
This is also something you need to learn over time.
96
00:09:03,120 --> 00:09:06,960
Constant optimization, your database will probably grow.
97
00:09:06,960 --> 00:09:13,080
I will show you examples of that and that means you will continuously need to optimize
98
00:09:13,080 --> 00:09:21,080
and again, constantly change your configurations depending on the needs of your users.
99
00:09:21,080 --> 00:09:31,060
So again, feel free to do this internally but make sure that the people who are taking
100
00:09:31,060 --> 00:09:35,520
care of Matomo have this as their focus.
101
00:09:35,520 --> 00:09:43,440
There's a huge difference between just setting up a server and installing software, and maintaining an
102
00:09:43,440 --> 00:09:46,800
application properly.
103
00:09:46,800 --> 00:09:54,680
So let's look into some of the more specific problems that are related to users.
104
00:09:54,680 --> 00:10:05,600
I talked a little bit about segments and custom reports because these both affect the aggregation
105
00:10:05,600 --> 00:10:07,320
in Matomo.
106
00:10:07,320 --> 00:10:14,880
Aggregation is the jobs that happen under the hood to produce the fast reports in the
107
00:10:14,880 --> 00:10:17,120
interface.
108
00:10:17,120 --> 00:10:22,880
So if you go to the page views report, that is actually an aggregation of the raw data
109
00:10:22,880 --> 00:10:31,200
and to produce that, there are jobs running under the hood in Matomo and they will require
110
00:10:31,200 --> 00:10:32,880
more server power.
111
00:10:32,880 --> 00:10:38,120
So every time you create a new segment or a custom report, you actually add a new job
112
00:10:38,120 --> 00:10:40,880
to Matomo to perform.
113
00:10:40,880 --> 00:10:45,880
I've seen recommendations in the community and we try to have these recommendations
114
00:10:45,880 --> 00:10:51,400
as well to not set up more than a hundred segments for a site and a hundred segments
115
00:10:51,400 --> 00:10:57,880
to be honest can be quite a lot for your servers to maintain as well.
116
00:10:57,880 --> 00:11:01,440
So yeah, keep that in mind.
117
00:11:01,440 --> 00:11:08,880
One thing we learned really, really the hard way is that if you create segments in Matomo
118
00:11:08,880 --> 00:11:18,200
and you use these kinds of wildcard searches in your definitions, they have a large impact
119
00:11:18,200 --> 00:11:25,760
on your database performance because what happens is you see the SQL query that is created.
120
00:11:25,760 --> 00:11:29,040
This is not the exact query, but it's just an example.
121
00:11:29,040 --> 00:11:33,920
The important part is the search string at the end.
122
00:11:33,920 --> 00:11:42,320
When you send a query like that, which happens if you use the contains setting, we're
123
00:11:42,320 --> 00:11:48,920
not able to use indexes in Matomo and when you create segments, these queries go to
124
00:11:48,920 --> 00:11:58,680
one of the largest tables in the Matomo database and indexes are a way to make queries faster
125
00:11:58,680 --> 00:12:00,520
basically in MySQL.
126
00:12:00,520 --> 00:12:04,280
But these types of queries cannot use indexes.
127
00:12:04,280 --> 00:12:10,280
So that means that they have to do a full table scan, which is kind of a technical term
128
00:12:10,280 --> 00:12:16,720
here and that will really, really kill your performance if you have a lot of data.
129
00:12:16,720 --> 00:12:23,720
We're talking about minutes to run SQL queries if you have a large database instead of maybe
130
00:12:23,720 --> 00:12:26,360
seconds if you have a proper index in place.
131
00:12:26,360 --> 00:12:32,280
So the impact here can be massive and especially if you have a lot of these segments in place
132
00:12:32,280 --> 00:12:36,000
on a large database.
133
00:12:36,000 --> 00:12:44,320
So instead, you should try to change your segments into using starts with, for instance.
134
00:12:44,320 --> 00:12:51,120
If you do that, your query will be different and MySQL will actually be able to perform
135
00:12:51,120 --> 00:12:53,760
these queries a lot faster.
136
00:12:53,760 --> 00:13:01,360
So it's quite important actually on a large data set to manage this.
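A rough illustration of why the prefix form is cheaper, assuming a B-tree style index that keeps values sorted. In SQL terms, a contains filter ends up as something like `LIKE '%term%'`, while starts with ends up as `LIKE 'term%'`. The JavaScript sketch below is only an analogy: the sorted array stands in for the index, and the sample URLs and function names are made up for the example, not taken from Matomo or MySQL.

```javascript
// Hypothetical sample data; a sorted array stands in for an index on the URL column.
const urls = ['/shop/order', '/blog/a', '/docs/install', '/shop/cart', '/blog/b'].sort();

// Binary search: first position whose value is >= target.
function lowerBound(sorted, target) {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// "starts with": jump straight to the first match, stop at the first non-match.
function startsWithSearch(sorted, prefix) {
  const matches = [];
  for (let i = lowerBound(sorted, prefix); i < sorted.length && sorted[i].startsWith(prefix); i++) {
    matches.push(sorted[i]);
  }
  return matches;
}

// "contains": the sort order gives no shortcut, so every row is inspected.
function containsSearch(rows, needle) {
  return rows.filter((row) => row.includes(needle));
}
```

With millions of rows, the first function touches only the matching range, while the second one always reads everything, which is the full table scan described above.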
137
00:13:01,360 --> 00:13:13,720
Yes, so with this in mind, make sure that only users with this knowledge can create
138
00:13:13,720 --> 00:13:14,720
segments.
139
00:13:14,720 --> 00:13:20,920
And this will also help us avoid bad configurations, because I see a lot of users
140
00:13:20,920 --> 00:13:26,800
that actually use segments in the wrong way.
141
00:13:26,800 --> 00:13:31,640
I won't go too much into that, or I can.
142
00:13:31,640 --> 00:13:37,280
The example is if you want to filter out pages for a specific section of your site, I often
143
00:13:37,280 --> 00:13:44,000
see users creating segments for this because they misunderstand how segments work.
144
00:13:44,000 --> 00:13:51,360
A segment is a way to create a bucket of users matching certain criteria.
145
00:13:51,360 --> 00:13:56,320
So even though you will filter out page views for that particular section, you will also
146
00:13:56,320 --> 00:14:04,640
filter out page views for other sections that the users matching your segment criteria visited.
147
00:14:04,640 --> 00:14:07,320
So yeah, different thing.
148
00:14:07,320 --> 00:14:28,880
So yeah, one thing I often see is also that users that create segments, you can do it
149
00:14:28,880 --> 00:14:31,240
on a personal level.
150
00:14:31,240 --> 00:14:33,040
So only you see the segment.
151
00:14:33,040 --> 00:14:38,080
As an administrator and particularly on sites with a lot of users, I can see that sometimes
152
00:14:38,080 --> 00:14:45,640
there are up to ten segments that are almost the same thing because users create them under
153
00:14:45,640 --> 00:14:46,640
the hood.
154
00:14:46,640 --> 00:14:53,760
And this is also bad because that means ten jobs are running under the hood when you might
155
00:14:53,760 --> 00:14:57,480
only have needed one of them.
156
00:14:57,480 --> 00:15:00,880
So yeah, keep that in mind.
157
00:15:00,880 --> 00:15:07,160
And of course, this leads to avoiding performance issues.
158
00:15:07,160 --> 00:15:09,800
Also this is a nice one.
159
00:15:09,800 --> 00:15:21,160
Look into the global.ini.php file on your server; there you can actually set
160
00:15:21,160 --> 00:15:25,100
the user role needed to add segments in Matomo.
161
00:15:25,100 --> 00:15:30,400
So I think that the default one is view here and that means that basically any user in
162
00:15:30,400 --> 00:15:33,600
your Matomo instance can create segments.
163
00:15:33,600 --> 00:15:43,220
So make sure to make it admin or super user or write or whatever matches your organization.
164
00:15:43,220 --> 00:15:46,520
So that's a good thing to look into.
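A config fragment for the setting described above. To the best of my knowledge the key is `adding_segment_requires_access` in the `[General]` section; treat the name as an assumption and check it against your own global.ini.php. Overrides normally go into config/config.ini.php rather than into global.ini.php itself, and the value "admin" is just one possible choice.

```ini
[General]
; Minimum access level needed to create segments (the talk mentions "view" as the default).
adding_segment_requires_access = "admin"
```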
165
00:15:46,520 --> 00:15:47,520
All right.
166
00:15:47,520 --> 00:15:51,240
So what about bad applications?
167
00:15:51,240 --> 00:15:59,320
Well, one thing that is really important for performance is to keep the number of URLs
168
00:15:59,320 --> 00:16:00,320
down.
169
00:16:00,320 --> 00:16:08,920
And the thing here is that Matomo does not treat uppercase
170
00:16:08,920 --> 00:16:09,920
and lowercase as the same.
171
00:16:09,920 --> 00:16:16,840
So if you have a page that will report sometimes with an uppercase and sometimes with a lowercase,
172
00:16:16,840 --> 00:16:20,920
Matomo will handle that as two different URLs.
173
00:16:20,920 --> 00:16:26,840
So make sure to always send lowercase URLs.
174
00:16:26,840 --> 00:16:33,160
Make sure to not send URL parameters if you can avoid that because that will also create
175
00:16:33,160 --> 00:16:42,820
unique instances of URLs, meaning the jobs in Matomo need to process every unique URL.
176
00:16:42,820 --> 00:16:48,160
Also some CMSs sometimes send trailing slashes.
177
00:16:48,160 --> 00:16:55,840
The example here is /mypage, and that will not be the same as /mypage/ with a trailing slash.
178
00:16:55,840 --> 00:17:01,160
Another common thing that you might want to avoid is, for instance, if you have an issue
179
00:17:01,160 --> 00:17:09,600
number with unique URLs containing numbers, you might want to look into anonymizing this
180
00:17:09,600 --> 00:17:16,980
so that you don't store unnecessary page views.
181
00:17:16,980 --> 00:17:23,900
So yeah, what I usually do is that I create, actually I'll show you a script that can do
182
00:17:23,900 --> 00:17:31,600
this later on, but sometimes you have an issue URL and you can end up with thousands of unique
183
00:17:31,600 --> 00:17:37,800
page views or unique URLs, but no one will ever look at the unique page view for that.
184
00:17:37,800 --> 00:17:47,800
So anonymizing them is actually good for performance and probably also for the analytics part to
185
00:17:47,800 --> 00:17:50,200
understand the site.
186
00:17:50,200 --> 00:17:59,200
So one way to do this, this is custom JavaScript that I've added to the site.
187
00:17:59,200 --> 00:18:00,200
It's a variable.
188
00:18:00,200 --> 00:18:06,200
So what you can do here is that you create the URL cleanup and in this example, I started
189
00:18:06,200 --> 00:18:11,680
off like I'm reading the URL from the browser and making it lowercase.
190
00:18:11,680 --> 00:18:19,040
The second one will remove trailing slashes, then I have an example of how you can anonymize
191
00:18:19,040 --> 00:18:20,740
an order ID.
192
00:18:20,740 --> 00:18:26,840
So if you have a URL saying order 123456, this will become order
193
00:18:26,840 --> 00:18:29,760
slash order page.
194
00:18:29,760 --> 00:18:35,880
And the way you use this then is you go to your page view and you use this variable in
195
00:18:35,880 --> 00:18:44,160
your custom URL and by doing so, you can actually clean your site and this is also a good way.
196
00:18:44,160 --> 00:18:49,720
If you have problematic applications, this kind of JavaScript can actually help you to
197
00:18:49,720 --> 00:18:54,560
anonymize and remove personal data if you find those kind of issues.
198
00:18:54,560 --> 00:19:01,520
So maybe you have an application that shares personal identification numbers or email addresses
199
00:19:01,520 --> 00:19:03,160
from clients or whatever.
200
00:19:03,160 --> 00:19:07,040
I've seen tons of examples of that.
201
00:19:07,040 --> 00:19:18,400
You can actually use this kind of URL cleanup to clean things up.
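A minimal sketch of such a URL cleanup, following the steps described above (lowercase, strip trailing slashes, anonymize order IDs). It is not the script from the slides: the function name `cleanUrl`, the `/order/<digits>` pattern and the `order-page` placeholder are assumptions for illustration. It also drops the query string, in line with the earlier advice about URL parameters. The talk wires this in as a custom JavaScript variable; the last lines instead show the plain tracker call `setCustomUrl`, which has to run before `trackPageView`.

```javascript
// Hypothetical helper: returns a normalized URL to report to Matomo.
function cleanUrl(rawUrl) {
  const url = new URL(rawUrl);

  // 1. Lowercase the path so /MyPage and /mypage count as one URL.
  let path = url.pathname.toLowerCase();

  // 2. Remove trailing slashes, but keep the root path "/".
  path = path.replace(/\/+$/, '') || '/';

  // 3. Replace numeric order IDs with a fixed placeholder (assumed pattern).
  path = path.replace(/\/order\/\d+(?=\/|$)/g, '/order/order-page');

  // Query string and fragment are left out on purpose.
  return url.origin + path;
}

// Browser only: hand the cleaned URL to the Matomo tracker.
if (typeof window !== 'undefined') {
  window._paq = window._paq || [];
  window._paq.push(['setCustomUrl', cleanUrl(window.location.href)]);
}
```

The same function is a reasonable place to strip email addresses or personal identification numbers from paths, with one extra `replace` per pattern.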
202
00:19:18,400 --> 00:19:25,860
Another thing that I often see and some CMS systems are really terrible here.
203
00:19:25,860 --> 00:19:29,800
So a good URL structure looks something like this.
204
00:19:29,800 --> 00:19:38,400
You have maybe URLs following your menu structure or other types of categories and you have
205
00:19:38,400 --> 00:19:41,120
a clear structure in your URLs.
206
00:19:41,120 --> 00:19:47,240
And since Matomo is a stupid system, it will just take the URLs as you send them to Matomo.
207
00:19:47,240 --> 00:19:52,760
A bad example is if your CMS will have this kind of structure.
208
00:19:52,760 --> 00:19:57,600
This is also bad if you're into SEO, of course.
209
00:19:57,600 --> 00:20:05,280
But the point here is like if you go to your page views report, you will see something
210
00:20:05,280 --> 00:20:09,240
like this if you do a nice URL structure.
211
00:20:09,240 --> 00:20:16,400
You will have aggregated numbers matching your sections.
212
00:20:16,400 --> 00:20:26,640
So this is really a nice way to or a nice feature of having nice applications.
213
00:20:26,640 --> 00:20:34,200
If you have problems, you can actually fix some of them in this one.
214
00:20:34,200 --> 00:20:43,000
So sometimes I've seen applications where we have multiple URLs for the same page, which
215
00:20:43,000 --> 00:20:44,840
is obviously bad.
216
00:20:44,840 --> 00:20:54,120
Then you can add those things into this URL cleanup JavaScript as well and maybe make
217
00:20:54,120 --> 00:21:01,560
sure that they are reporting under the same URL in your or in Matomo at least.
218
00:21:01,560 --> 00:21:04,800
There are lots of ways we can help here.
219
00:21:04,800 --> 00:21:07,320
All right.
220
00:21:07,320 --> 00:21:14,740
So let's look into optimizing Matomo from a more of a server perspective.
221
00:21:14,740 --> 00:21:22,080
So some common issues in Matomo are that the transitions report is slow if you use that
222
00:21:22,080 --> 00:21:23,080
one.
223
00:21:23,080 --> 00:21:30,120
Custom date ranges are slow because what happens if you use a custom date range is that you're
224
00:21:30,120 --> 00:21:33,640
actually querying the raw data.
225
00:21:33,640 --> 00:21:39,240
The visitor log obviously is something that can be slow.
226
00:21:39,240 --> 00:21:46,440
Sometimes you can see that today's data is not coming in as quickly as you would like
227
00:21:46,440 --> 00:21:47,680
to.
228
00:21:47,680 --> 00:21:55,640
Maybe you don't see data for today until the evening or whatever, or users get 500 errors
229
00:21:55,640 --> 00:21:58,760
or timeout errors or things like that.
230
00:21:58,760 --> 00:22:04,000
There are no general solutions here, but if you have a lot of these and this is quite
231
00:22:04,000 --> 00:22:11,560
common if your data grows, you might want to look into your server configurations.
232
00:22:11,560 --> 00:22:15,080
First of all, look into the general guidelines.
233
00:22:15,080 --> 00:22:23,400
Always start there to see if you're, are you even following the basics before you go into
234
00:22:23,400 --> 00:22:29,440
maybe my more advanced recommendations later on.
235
00:22:29,440 --> 00:22:34,880
In general, you need to understand that Matomo is a PHP application and we usually refer
236
00:22:34,880 --> 00:22:38,520
to that as a LAMP installation.
237
00:22:38,520 --> 00:22:42,440
That means we have an Apache web server.
238
00:22:42,440 --> 00:22:45,920
We have MySQL and we have PHP.
239
00:22:45,920 --> 00:22:48,840
Those are the basic main components.
240
00:22:48,840 --> 00:22:55,440
There are more and you can actually change Apache to Nginx or some other web server
241
00:22:55,440 --> 00:23:05,480
and MySQL can be MariaDB, so don't take it too much of a, yeah, it can vary.
242
00:23:05,480 --> 00:23:12,240
But when you optimize these, you need to understand what kind of needs these systems have.
243
00:23:12,240 --> 00:23:19,000
So normally Apache is a very CPU intensive system, meaning what you can optimize there
244
00:23:19,000 --> 00:23:27,520
is usually threads and sometimes a bit of memory, but you can optimize how you use
245
00:23:27,520 --> 00:23:28,520
the CPU.
246
00:23:28,520 --> 00:23:37,480
MySQL is usually memory and disk intensive, so make sure to have fast disks and then you
247
00:23:37,480 --> 00:23:45,680
can often tweak MySQL variables and this is actually one thing when I talk to IT departments
248
00:23:45,680 --> 00:23:53,840
that have been trying to maintain Matomo, they usually install the database in kind of the
249
00:23:53,840 --> 00:24:01,240
default setup that comes with MySQL and that is probably the easiest way to make your Matomo
250
00:24:01,240 --> 00:24:09,680
server slow because if you know MySQL, you can optimize I think over 200 variables.
251
00:24:09,680 --> 00:24:18,680
You shouldn't, but you can, but an application like Matomo that is very database intensive
252
00:24:18,680 --> 00:24:24,360
really, really needs a lot of optimizations and there are great tools to help you with
253
00:24:24,360 --> 00:24:25,360
that.
254
00:24:25,360 --> 00:24:35,040
PHP is of course super important and for that we usually use the memory and super important
255
00:24:35,040 --> 00:24:43,880
to cache PHP because PHP is a language that will reprocess every page every time you see
256
00:24:43,880 --> 00:24:50,480
it basically or every time a function is loaded, it's actually processed unless you use caching.
257
00:24:50,480 --> 00:24:56,360
So caching is key when it comes to PHP.
258
00:24:56,360 --> 00:25:04,000
One thing that is super nice and particularly on larger installations is to use the Queued Tracking
259
00:25:04,000 --> 00:25:13,840
plugin which will offload your database significantly and also give you features such as you can
260
00:25:13,840 --> 00:25:17,520
restart your Matomo installation without losing data.
261
00:25:17,520 --> 00:25:26,000
I really encourage you to look into this one if you haven't already.
262
00:25:26,000 --> 00:25:29,920
You can also configure tons of things in Matomo.
263
00:25:29,920 --> 00:25:37,920
I linked to this little file on matomo.org.
264
00:25:37,920 --> 00:25:44,040
I'm going to show you that actually.
265
00:25:44,040 --> 00:25:52,160
Let's open that file just to give you an impression but in your Matomo installation you have a
266
00:25:52,160 --> 00:25:55,520
file called global.ini.php.
267
00:25:55,520 --> 00:26:00,240
This one is actually from GitHub but just to give you an idea how much you can actually
268
00:26:00,240 --> 00:26:10,520
configure on your Matomo server or your particular Matomo instance, you can tweak a lot of things
269
00:26:10,520 --> 00:26:18,920
and this is great because it means we can adapt Matomo in tons of ways but it also means
270
00:26:18,920 --> 00:26:24,840
that you can get problems if you don't know what you're doing.
271
00:26:24,840 --> 00:26:30,960
So one thing you usually want to have control of is the number of rows that you allow in
272
00:26:30,960 --> 00:26:33,400
different reports.
273
00:26:33,400 --> 00:26:38,200
From a user perspective you normally want to tweak this upwards because I think the
274
00:26:38,200 --> 00:26:43,400
default is that you show only 500 URLs in reports.
275
00:26:43,400 --> 00:26:53,000
If you have a lot of URLs you obviously want more than 500, but setting them too high
276
00:26:53,000 --> 00:26:56,200
will affect performance.
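A config fragment for the row limits mentioned above, as an override in config/config.ini.php. As far as I know the relevant keys in `[General]` are the two shown here, with defaults of 500 and 100 in global.ini.php; the raised values are arbitrary examples, not recommendations, and should be tested against your own archiving times.

```ini
[General]
; Maximum rows kept for page URL and page title reports (example value).
datatable_archiving_maximum_rows_actions = 2000
; Maximum rows kept in each subtable of those reports (example value).
datatable_archiving_maximum_rows_subtable_actions = 500
```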
277
00:26:56,200 --> 00:27:00,640
Auto scanning of forms if you use the form analytics plugin is something you should be
278
00:27:00,640 --> 00:27:04,480
really careful with; that can make your database grow.
279
00:27:04,480 --> 00:27:08,560
So keep an eye on how long you store your raw data.
280
00:27:08,560 --> 00:27:12,440
This is normally something you should know when you start but sometimes people don't
281
00:27:12,440 --> 00:27:20,360
have a clue what that is, and that means the database is growing a lot.
282
00:27:20,360 --> 00:27:30,160
And also be a bit careful with plugins because plugins will consume a lot of memory and if
283
00:27:30,160 --> 00:27:35,520
you don't need them don't install them.
284
00:27:35,520 --> 00:27:41,220
Yes MySQL, yeah this is actually something we learned.
285
00:27:41,220 --> 00:27:48,200
We are using MariaDB and what we learned is that there are actually features in Matomo
286
00:27:48,200 --> 00:27:54,800
that use functions in MySQL that are not available in MariaDB.
287
00:27:54,800 --> 00:27:59,120
So if you're having a large database you might run into problems.
288
00:27:59,120 --> 00:28:06,960
If you want to know the details I think Jorge, my colleague linked to those issues at Matomo
289
00:28:06,960 --> 00:28:15,220
but one thing on a large database at least you need to keep an eye on the database size.
290
00:28:15,220 --> 00:28:22,100
You can use tools like Tuning Primer or MySQLTuner to get the insights and one thing we
291
00:28:22,100 --> 00:28:28,680
also learned is that over time the number of tables in Matomo will grow.
292
00:28:28,680 --> 00:28:36,760
So you might want to check these configuration options or actually configuration variables
293
00:28:36,760 --> 00:28:44,560
in MySQL because if you don't adjust these according to your database you might have
294
00:28:44,560 --> 00:28:48,040
performance issues.
295
00:28:48,040 --> 00:28:55,600
There are also optimizations that you can do to make transition reports faster.
296
00:28:55,600 --> 00:29:00,600
So there is an FAQ about that.
297
00:29:00,600 --> 00:29:02,520
Every Matomo installation is quite unique.
298
00:29:02,520 --> 00:29:08,280
So I wouldn't recommend you to just apply every index for databases that you find.
299
00:29:08,280 --> 00:29:13,760
You should really try them and test how they affect your installations.
300
00:29:13,760 --> 00:29:19,520
Indexes can grow your database quite a lot and you should do this with care.
301
00:29:19,520 --> 00:29:26,160
This is obviously something you can look into and test.
302
00:29:26,160 --> 00:29:31,840
Another thing that is quite common is that if you have a lot of segments and custom reports
303
00:29:31,840 --> 00:29:38,120
you might want to start looking into how to optimize the archiving that happens under
304
00:29:38,120 --> 00:29:39,120
the hood.
305
00:29:39,120 --> 00:29:45,720
One thing we've been doing is that if a client has a lot of websites normally there is only
306
00:29:45,720 --> 00:29:51,560
one or two websites that have a lot of data and a lot of segments and the other websites
307
00:29:51,560 --> 00:29:54,760
in the installation are usually small.
308
00:29:54,760 --> 00:30:00,200
So one good thing is actually to separate these into different jobs meaning one job
309
00:30:00,200 --> 00:30:04,640
that might be heavy and take a lot of time will not affect the smaller sites.
310
00:30:04,640 --> 00:30:12,560
So having two jobs you can just do this, skip ID site number one, then force ID site equals
311
00:30:12,560 --> 00:30:19,080
one on the other job and that will offload or actually give faster reports for the small
312
00:30:19,080 --> 00:30:21,000
sites.
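The split described above could look roughly like the following crontab fragment, assuming the heavy site has idSite 1. `--skip-idsites` and `--force-idsites` are options of the `core:archive` console command; the schedule, the PHP binary path and `/path/to/matomo` are placeholders.

```
# Job 1: every site except the heavy one (idSite 1 is an assumption)
5 * * * *  /usr/bin/php /path/to/matomo/console core:archive --skip-idsites=1

# Job 2: only the heavy site, so a long run does not delay the small sites
35 * * * * /usr/bin/php /path/to/matomo/console core:archive --force-idsites=1
```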
313
00:30:21,000 --> 00:30:27,720
Another example is to separate so that you run segments separately.
314
00:30:27,720 --> 00:30:36,120
That means that if you're not applying a segment those jobs might be faster but then you might
315
00:30:36,120 --> 00:30:44,720
have other segments that might take time if someone configured for instance segments containing
316
00:30:44,720 --> 00:30:52,520
wildcards that take a lot of time, you can actually separate these into specific jobs.
317
00:30:52,520 --> 00:31:00,000
Maybe these reports will take hours to process but the other segments will respond faster.
318
00:31:00,000 --> 00:31:05,680
So that's something you can look into and start tweaking.
319
00:31:05,680 --> 00:31:14,120
Another example is that you want to start optimizing your archiving processes.
320
00:31:14,120 --> 00:31:20,960
You can actually use flags like concurrent-requests-per-website or concurrent-archivers
321
00:31:20,960 --> 00:31:30,080
and you can also start tweaking the PHP memory that an archiver can consume.
322
00:31:30,080 --> 00:31:38,320
But this is really something you should know what you're doing before trying or you can
323
00:31:38,320 --> 00:31:42,880
try it and see and learn.
324
00:31:42,880 --> 00:31:49,680
One thing that I've been doing is that I've actually created a plugin.
325
00:31:49,680 --> 00:31:55,320
To warn you a bit this is actually a plugin I wrote to learn how to write plugins.
326
00:31:55,320 --> 00:32:02,720
So it's maybe not the most beautiful code I've ever written, but it's been a good learning.
327
00:32:02,720 --> 00:32:10,000
But what it actually does is that it adds some reports to my Matomo installation.
328
00:32:10,000 --> 00:32:15,640
So under the hood I can go in and get a few recommendations.
329
00:32:15,640 --> 00:32:23,080
So I have these performance tips for instance that will give me some insights on how my
330
00:32:23,080 --> 00:32:34,560
database feels and I'm doing some calculations on how my MySQL database is performing and
331
00:32:34,560 --> 00:32:41,680
in this case for instance we're not really following the recommendations on how to configure
332
00:32:41,680 --> 00:32:42,680
Matomo.
333
00:32:42,680 --> 00:32:48,120
This is a database setting that we should change and I also linked to a recommendation
334
00:32:48,120 --> 00:32:49,120
for Matomo.
335
00:32:49,120 --> 00:32:54,320
So yeah this is something I've been improving over time and we actually use it in our database
336
00:32:54,320 --> 00:32:57,200
to get some feedback.
337
00:32:57,200 --> 00:33:02,440
Another thing I recently added is regarding these segments.
338
00:33:02,440 --> 00:33:03,680
I wasn't logged in.
339
00:33:03,680 --> 00:33:06,680
I will fix that.
340
00:33:06,680 --> 00:33:13,840
I can't actually because this one requires me to be on a VPN so we won't do that.
341
00:33:13,840 --> 00:33:18,920
But I actually created a list of problematic segments.
342
00:33:18,920 --> 00:33:21,640
So segments containing wildcards.
343
00:33:21,640 --> 00:33:31,680
You can go to this repository and download it and please give feedback.
344
00:33:31,680 --> 00:33:38,040
Maybe you want to help contribute to make it follow the Matomo coding guidelines.
345
00:33:38,040 --> 00:33:47,040
But basically I have created some ways to easily get status variables and things like
346
00:33:47,040 --> 00:33:52,920
that.
347
00:33:52,920 --> 00:33:53,920
What else?
348
00:33:53,920 --> 00:34:04,280
We have time for questions.
349
00:34:04,280 --> 00:34:05,480
Thank you.
350
00:34:05,480 --> 00:34:11,400
We have one question which is would you be interested in publishing the DB health plugin
351
00:34:11,400 --> 00:34:12,400
on the marketplace?
352
00:34:12,400 --> 00:34:13,400
Definitely.
353
00:34:13,400 --> 00:34:15,400
But I need some help.
354
00:34:15,400 --> 00:34:19,960
Maybe Lukas can help me to contribute.
355
00:34:19,960 --> 00:34:26,400
But I would definitely be interested in doing that.
356
00:34:26,400 --> 00:34:36,640
It's already free to download, so I guess if Lukas helps out it would be nice.
357
00:34:36,640 --> 00:34:39,440
You might look into it.
358
00:34:39,440 --> 00:34:42,720
So if there are any other questions feel free to ask them.
359
00:34:42,720 --> 00:35:11,960
I'm just going to give a few moments for that.
360
00:35:11,960 --> 00:35:12,960
All right.
361
00:35:12,960 --> 00:35:16,360
So no new questions.
362
00:35:16,360 --> 00:35:17,360
It seems great.
363
00:35:17,360 --> 00:35:18,360
Someone's typing.
364
00:35:18,360 --> 00:35:19,360
Yeah.
365
00:35:19,360 --> 00:35:28,240
Just a few more seconds maybe.
366
00:35:28,240 --> 00:35:43,560
I mean as always the chat stays open.
367
00:35:43,560 --> 00:35:49,000
So maybe if you see a question later you can answer it then.
368
00:35:49,000 --> 00:35:56,080
But with that for now I think I want to thank you again.
369
00:35:56,080 --> 00:36:00,920
I mean you're going to have another talk I think next hour but have a great day until
370
00:36:00,920 --> 00:36:01,920
then.
371
00:36:01,920 --> 00:36:02,920
Yeah.