1 00:00:00,000 --> 00:00:10,240 Okay, and with this, I would like to start the next talk. 2 00:00:10,240 --> 00:00:14,800 It is once again for Thomas Persson, like I said before, but just for the recordings. 3 00:00:14,800 --> 00:00:19,560 He's a business developer at Digitalist Sweden, and he has worked with tracking digital analytics 4 00:00:19,560 --> 00:00:26,560 since 2010, and has been a contributor to open source since 2007. 5 00:00:26,560 --> 00:00:31,200 And in this talk, how to break Matomo and also some inputs on how to fix it, Thomas 6 00:00:31,200 --> 00:00:36,320 will share his experience with us with running several large Matomo instances. 7 00:00:36,320 --> 00:00:41,400 As most people know, Matomo can be pretty slow when the data grows, and Thomas will 8 00:00:41,400 --> 00:00:46,160 share some of the biggest mistakes and also talk about how they manage them. 9 00:00:46,160 --> 00:00:47,160 And thank you. 10 00:00:47,160 --> 00:00:48,160 Please take it away. 11 00:00:48,160 --> 00:00:49,160 Thank you. 12 00:00:49,160 --> 00:00:56,200 So, yeah, sharing pains is usually good learning. 13 00:00:56,200 --> 00:01:04,280 And I hope you guys can ask a lot of questions, and, yeah, we will hopefully get some interaction 14 00:01:04,280 --> 00:01:06,040 going on as well. 15 00:01:06,040 --> 00:01:12,640 So this is actually based on my learnings, or our learnings, I should say, because it's 16 00:01:12,640 --> 00:01:17,640 not only me at Digitalist working with Matomo, of course, but some things we learned during 17 00:01:17,640 --> 00:01:29,120 the last four years when we focused pretty intense on maintaining Matomo for a lot of 18 00:01:29,120 --> 00:01:30,120 clients. 19 00:01:30,120 --> 00:01:36,320 So let's start off by doing the short intro. 20 00:01:36,320 --> 00:01:43,920 I'm a generalist, but in this case, I kind of had a background in the 90s where I actually 21 00:01:43,920 --> 00:01:51,160 worked with hosting and maintaining the servers that were really physical and heavy and large 22 00:01:51,160 --> 00:01:52,160 at that time. 23 00:01:52,160 --> 00:01:59,120 So that knowledge I've kind of brought with me during the years and actually got to use 24 00:01:59,120 --> 00:02:07,120 pretty intensively while learning and managing Matomo for our large clients. 25 00:02:07,120 --> 00:02:14,280 Did you listen to my colleague Holger yesterday that talked about how we technically maintain 26 00:02:14,280 --> 00:02:20,600 very large installations of Matomo just to get an idea of the size? 27 00:02:20,600 --> 00:02:30,840 I think we have databases that go above a bit of a terabyte of data, and we collect 28 00:02:30,840 --> 00:02:37,960 up to between 50 and 100 million actions per month stored in these databases. 29 00:02:37,960 --> 00:02:44,880 So they're quite big, and we maintain them in a setup using Kubernetes where we can scale 30 00:02:44,880 --> 00:02:47,040 things up and down. 31 00:02:47,040 --> 00:02:52,480 Obviously not the most standard setup, but I will not talk so much about Kubernetes today, 32 00:02:52,480 --> 00:02:59,760 more about the practical issues and the configurations that we can share with you guys. 33 00:02:59,760 --> 00:03:07,240 So this is a scenario that perhaps many of you have seen. 34 00:03:07,240 --> 00:03:14,200 We get different types of error messages or reports that are slow in Matomo, and obviously 35 00:03:14,200 --> 00:03:19,480 our users can be pretty sad when that happens. 36 00:03:19,480 --> 00:03:26,720 If you've been working with Matomo, you probably have these experiences over time. 37 00:03:26,720 --> 00:03:34,980 So my pro tips if you want to have this scenario is I have three main principle guidelines 38 00:03:34,980 --> 00:03:38,320 that you can follow if you want to have these problems. 39 00:03:38,320 --> 00:03:44,040 The first pro tips is to give your users in Matomo a lot of permissions from start without 40 00:03:44,040 --> 00:03:46,240 training them. 41 00:03:46,240 --> 00:03:51,740 This I can guarantee you, you will run into a lot of nice issues and problems. 42 00:03:51,740 --> 00:03:58,320 My second pro tips is to write bad applications and start tracking them in Matomo, and then 43 00:03:58,320 --> 00:04:03,680 of course blame Matomo for the problems that you end up having. 44 00:04:03,680 --> 00:04:07,080 We'll look in a bit more to this later on. 45 00:04:07,080 --> 00:04:13,640 And my number three pro tips is to let your IT department host Matomo without proper Matomo 46 00:04:13,640 --> 00:04:14,640 training. 47 00:04:14,640 --> 00:04:20,520 We'll also discuss a little bit in detail what this might lead to. 48 00:04:20,520 --> 00:04:28,400 Of course, I'm a bit in a joking mode here, but I'll try to give you some examples of 49 00:04:28,400 --> 00:04:31,400 why I give you these pro tips. 50 00:04:31,400 --> 00:04:39,080 So the first one about giving Matomo users too much permissions, Matomo is a quite complex 51 00:04:39,080 --> 00:04:40,080 tool. 52 00:04:40,080 --> 00:04:47,760 Of course, setting up tracking is something that requires quite a lot of skills in web 53 00:04:47,760 --> 00:04:54,960 development and also JavaScript, and you also need to know how the product works to do this 54 00:04:54,960 --> 00:04:55,960 properly. 55 00:04:55,960 --> 00:05:06,080 Otherwise, you will have bad data in your database and the reports will be hard to understand. 56 00:05:06,080 --> 00:05:13,960 Also creating segments or custom reports has to be pretty, you have to know what you're 57 00:05:13,960 --> 00:05:18,360 doing, so to speak, because they can actually kill your performance. 58 00:05:18,360 --> 00:05:21,280 I will give you examples of this later on. 59 00:05:21,280 --> 00:05:27,200 But also doing this wrong might lead to really, really bad user experiences because you need 60 00:05:27,200 --> 00:05:32,520 to know what you are doing and what you're showing to your users. 61 00:05:32,520 --> 00:05:38,180 So you should be really careful about giving away permissions and make sure that users 62 00:05:38,180 --> 00:05:42,480 that can create segments, for instance, know what they're doing. 63 00:05:42,480 --> 00:05:48,880 The last one, or the second one, write bad applications and blame Matomo for the issues. 64 00:05:48,880 --> 00:05:56,760 So this is actually something you need to be aware of that Matomo is just a stupid system. 65 00:05:56,760 --> 00:06:01,000 Whatever you send to Matomo will be eaten by this system. 66 00:06:01,000 --> 00:06:07,320 So you can't really fix things that is broken in your application or actually can, and I 67 00:06:07,320 --> 00:06:13,720 will give you some examples of that, but you need to have the mentality of understanding 68 00:06:13,720 --> 00:06:16,760 that Matomo is just a stupid system. 69 00:06:16,760 --> 00:06:22,680 So if you send bad data into Matomo, this could be personal data or whatever, Matomo 70 00:06:22,680 --> 00:06:28,560 will just eat that and you will store it and you will have problems in different ways. 71 00:06:28,560 --> 00:06:35,920 The two most important things to manage properly are URLs and titles because these two will 72 00:06:35,920 --> 00:06:47,720 always be just stored in Matomo when we send it. 73 00:06:47,720 --> 00:06:56,960 And URLs is actually pretty important to manage in a large installation in particular. 74 00:06:56,960 --> 00:07:02,800 So the third guideline, then, let your IT department host Matomo without proper training. 75 00:07:02,800 --> 00:07:11,480 So this goes for many systems, not just Matomo, but why it fails. 76 00:07:11,480 --> 00:07:15,960 And I actually have lots of experiences with it. 77 00:07:15,960 --> 00:07:23,520 We help a lot of Matomo clients and they often approach us because they want help with implementing 78 00:07:23,520 --> 00:07:26,380 tracking or understanding the application. 79 00:07:26,380 --> 00:07:31,880 But usually I spend about a week with the IT department in the beginning just to fix 80 00:07:31,880 --> 00:07:37,920 things that hasn't been properly set up or taken care of. 81 00:07:37,920 --> 00:07:44,800 Because maintaining Matomo or any application actually requires you to know the application. 82 00:07:44,800 --> 00:07:50,200 It's not enough to just set up a server and install the software and then you're ready. 83 00:07:50,200 --> 00:07:57,240 And Matomo is really, really a tool that you need to know to maintain properly because 84 00:07:57,240 --> 00:08:01,720 it needs constant updates and testing comes with that. 85 00:08:01,720 --> 00:08:07,880 When we have a new release of Matomo, you really need to test this and know and understand 86 00:08:07,880 --> 00:08:14,000 what was actually installed and how will that affect our instance of Matomo. 87 00:08:14,000 --> 00:08:20,920 Even though we do this 24-7, we work with Matomo and learn, we still have problems when 88 00:08:20,920 --> 00:08:27,720 there are new updates because you don't always know exactly what will happen in your Matomo 89 00:08:27,720 --> 00:08:28,720 environment. 90 00:08:28,720 --> 00:08:34,640 And this is, of course, since it's a complex environment, you have plugins, you have different 91 00:08:34,640 --> 00:08:40,080 integrations and so on and every setup is unique. 92 00:08:40,080 --> 00:08:45,560 We need to have constant monitoring in place and monitoring in this case is not only about 93 00:08:45,560 --> 00:08:55,720 does Matomo respond, it can be monitoring individual reports or monitoring your database 94 00:08:55,720 --> 00:09:00,000 or monitoring for errors really connected to Matomo. 95 00:09:00,000 --> 00:09:03,120 This is also something you need to learn over time. 96 00:09:03,120 --> 00:09:06,960 Constant optimization, your database will probably grow. 97 00:09:06,960 --> 00:09:13,080 I will show you examples of that and that means you will continuously need to optimize 98 00:09:13,080 --> 00:09:21,080 and again, constantly change your configurations depending on the needs of your users. 99 00:09:21,080 --> 00:09:31,060 So again, feel free to do this internally but make sure that the people that is taking 100 00:09:31,060 --> 00:09:35,520 care of Matomo has this as their focus. 101 00:09:35,520 --> 00:09:43,440 It's a huge difference to just set up a server and install a software and maintaining an 102 00:09:43,440 --> 00:09:46,800 application properly. 103 00:09:46,800 --> 00:09:54,680 So let's look into some of the more specific problems that are related to users. 104 00:09:54,680 --> 00:10:05,600 I talked a little bit about segments and custom reports because these both affect the aggregation 105 00:10:05,600 --> 00:10:07,320 in Matomo. 106 00:10:07,320 --> 00:10:14,880 Aggregation is the jobs that happens under the hood to produce the fast reports in the 107 00:10:14,880 --> 00:10:17,120 interface. 108 00:10:17,120 --> 00:10:22,880 So if you go to the page views report, that is actually an aggregation of the raw data 109 00:10:22,880 --> 00:10:31,200 and to produce that, there are jobs running under the hood in Matomo and they will require 110 00:10:31,200 --> 00:10:32,880 more server power. 111 00:10:32,880 --> 00:10:38,120 So every time you create a new segment or a custom report, you actually add a new job 112 00:10:38,120 --> 00:10:40,880 to Matomo to perform. 113 00:10:40,880 --> 00:10:45,880 I've seen recommendations in the community and we try to have these recommendations 114 00:10:45,880 --> 00:10:51,400 as well to not set up more than a hundred segments for a site and a hundred segments 115 00:10:51,400 --> 00:10:57,880 to be honest can be quite a lot for your service to maintain as well. 116 00:10:57,880 --> 00:11:01,440 So yeah, keep that in mind. 117 00:11:01,440 --> 00:11:08,880 One thing we learned really, really the hard way is that if you create segments in Matomo 118 00:11:08,880 --> 00:11:18,200 and you use these kind of wild card questions in your definitions, they have a large impact 119 00:11:18,200 --> 00:11:25,760 on your database performance because what happens is you see the SQL query that is created. 120 00:11:25,760 --> 00:11:29,040 This is not the exact query, but it's just an example. 121 00:11:29,040 --> 00:11:33,920 The important part is the search string at the end. 122 00:11:33,920 --> 00:11:42,320 When you send a query like that, which happens if you use the contains setting is that we're 123 00:11:42,320 --> 00:11:48,920 not able to use indexes in Matomo and when you create segments, these queries goes to 124 00:11:48,920 --> 00:11:58,680 one of the largest tables in the Matomo database and indexes is a way to make queries faster 125 00:11:58,680 --> 00:12:00,520 basically in MySQL. 126 00:12:00,520 --> 00:12:04,280 But these type of queries cannot use indexes. 127 00:12:04,280 --> 00:12:10,280 So that means that they have to do a full table scan, which is kind of technical definition 128 00:12:10,280 --> 00:12:16,720 here and that will really, really kill your performance if you have a lot of data. 129 00:12:16,720 --> 00:12:23,720 We're talking about minutes to run SQL queries if you have a large database instead of maybe 130 00:12:23,720 --> 00:12:26,360 seconds if you have a proper index in place. 131 00:12:26,360 --> 00:12:32,280 So the impact here can be massive and especially if you have a lot of these segments in place 132 00:12:32,280 --> 00:12:36,000 on a large database. 133 00:12:36,000 --> 00:12:44,320 So instead, you should try to change your segments into using starts with, for instance. 134 00:12:44,320 --> 00:12:51,120 If you do that, your query will be different and MySQL will actually be able to perform 135 00:12:51,120 --> 00:12:53,760 these queries a lot faster. 136 00:12:53,760 --> 00:13:01,360 So it's quite important actually on a large data set to manage this. 137 00:13:01,360 --> 00:13:13,720 Yes, so with this in mind, make sure that only users with this knowledge can create 138 00:13:13,720 --> 00:13:14,720 segments. 139 00:13:14,720 --> 00:13:20,920 And this will also lead to that we avoid bad configurations because I see a lot of users 140 00:13:20,920 --> 00:13:26,800 that actually use segments in the wrong way. 141 00:13:26,800 --> 00:13:31,640 I won't go into too much about that, or I can. 142 00:13:31,640 --> 00:13:37,280 The example is if you want to filter out pages for a specific section of your site, I often 143 00:13:37,280 --> 00:13:44,000 see users creating segments for this because they misunderstand how segments work. 144 00:13:44,000 --> 00:13:51,360 This is a way to create a bucket of users matching certain criteria. 145 00:13:51,360 --> 00:13:56,320 So even though you will filter out page views for that particular section, you will also 146 00:13:56,320 --> 00:14:04,640 filter out page views for other sections that the users match in your segment criteria matched. 147 00:14:04,640 --> 00:14:07,320 So yeah, different thing. 148 00:14:07,320 --> 00:14:28,880 So yeah, one thing I often see is also that users that create segments, you can do it 149 00:14:28,880 --> 00:14:31,240 on a personal level. 150 00:14:31,240 --> 00:14:33,040 So only you see the segment. 151 00:14:33,040 --> 00:14:38,080 As an administrator and particularly on sites with a lot of users, I can see that sometimes 152 00:14:38,080 --> 00:14:45,640 there are up to ten segments that are almost the same thing because users create them under 153 00:14:45,640 --> 00:14:46,640 the hood. 154 00:14:46,640 --> 00:14:53,760 And this is also bad because that means ten jobs are running under the hood when you might 155 00:14:53,760 --> 00:14:57,480 only have needed one of them. 156 00:14:57,480 --> 00:15:00,880 So yeah, keep that in mind. 157 00:15:00,880 --> 00:15:07,160 And of course, this leads to avoiding performance issues. 158 00:15:07,160 --> 00:15:09,800 Also this is a nice one. 159 00:15:09,800 --> 00:15:21,160 Look into the global ini.php file on your server and make sure you can actually set 160 00:15:21,160 --> 00:15:25,100 this is the user role needed to add segments in Matomo. 161 00:15:25,100 --> 00:15:30,400 So I think that the default one is view here and that means that basically any user in 162 00:15:30,400 --> 00:15:33,600 your Matomo instance can create segments. 163 00:15:33,600 --> 00:15:43,220 So make sure to make it admin or super user or write or whatever your organization matches. 164 00:15:43,220 --> 00:15:46,520 So that's the good thing to look into. 165 00:15:46,520 --> 00:15:47,520 All right. 166 00:15:47,520 --> 00:15:51,240 So what about bad applications? 167 00:15:51,240 --> 00:15:59,320 Well, one thing that is really important for performance is to keep the number of URLs 168 00:15:59,320 --> 00:16:00,320 down. 169 00:16:00,320 --> 00:16:08,920 And the thing here is that if you, for instance, send, Matomo can't differ between uppercase 170 00:16:08,920 --> 00:16:09,920 and lowercase. 171 00:16:09,920 --> 00:16:16,840 So if you have a page that will report sometimes with an uppercase and sometimes with a lowercase, 172 00:16:16,840 --> 00:16:20,920 Matomo will handle that as two different URLs. 173 00:16:20,920 --> 00:16:26,840 So make sure to always send lowercase URLs. 174 00:16:26,840 --> 00:16:33,160 Make sure to not send URL parameters if you can avoid that because that will also create 175 00:16:33,160 --> 00:16:42,820 unique instances of URLs, meaning the jobs in Matomo needs to process every URL unique. 176 00:16:42,820 --> 00:16:48,160 Also some CMSs sometimes send trailing slashes. 177 00:16:48,160 --> 00:16:55,840 The example here is MyPage and that will not be the same as MyPage with an ending slash. 178 00:16:55,840 --> 00:17:01,160 Another common thing that you might want to avoid is, for instance, if you have an issue 179 00:17:01,160 --> 00:17:09,600 number with unique URLs containing numbers, you might want to look into anonymize this 180 00:17:09,600 --> 00:17:16,980 so that you don't store unnecessarily page views. 181 00:17:16,980 --> 00:17:23,900 So yeah, what I usually do is that I create, actually we show you a script that can do 182 00:17:23,900 --> 00:17:31,600 this later on, but sometimes you have an issue URL and you can end up with thousands of unique 183 00:17:31,600 --> 00:17:37,800 page views or unique URLs, but no one will never look at the unique page view for that. 184 00:17:37,800 --> 00:17:47,800 So anonymize them is actually good for performance and probably also for the analytics part to 185 00:17:47,800 --> 00:17:50,200 understand the site. 186 00:17:50,200 --> 00:17:59,200 So one way to do this, this is custom JavaScript that I've added to the site. 187 00:17:59,200 --> 00:18:00,200 It's a variable. 188 00:18:00,200 --> 00:18:06,200 So what you can do here is that you create the URL cleanup and in this example, I started 189 00:18:06,200 --> 00:18:11,680 off like I'm reading the URL from the browser and making it lowercase. 190 00:18:11,680 --> 00:18:19,040 The second one will remove trailing slashes, then I have an example of how you can anonymize 191 00:18:19,040 --> 00:18:20,740 an order ID. 192 00:18:20,740 --> 00:18:26,840 So if you have an URL saying order one, two, three, four, five, six, this will become order 193 00:18:26,840 --> 00:18:29,760 slash order page. 194 00:18:29,760 --> 00:18:35,880 And the way you use this then is you go to your page view and you use this variable in 195 00:18:35,880 --> 00:18:44,160 your custom URL and by doing so, you can actually clean your site and this is also a good way. 196 00:18:44,160 --> 00:18:49,720 If you have problematic applications, this kind of JavaScript can actually help you to 197 00:18:49,720 --> 00:18:54,560 anonymize and remove personal data if you find those kind of issues. 198 00:18:54,560 --> 00:19:01,520 So maybe you have an application that shares personal identification numbers or email addresses 199 00:19:01,520 --> 00:19:03,160 from clients or whatever. 200 00:19:03,160 --> 00:19:07,040 I've seen tons of that examples there. 201 00:19:07,040 --> 00:19:18,400 You can actually use this kind of an URL to clean things. 202 00:19:18,400 --> 00:19:25,860 Another thing that I often see and some CMS systems are really terrible here. 203 00:19:25,860 --> 00:19:29,800 So a good URL structure looks something like this. 204 00:19:29,800 --> 00:19:38,400 You have maybe URLs following your menu structure or other types of categories and you have 205 00:19:38,400 --> 00:19:41,120 a clear structure in your URLs. 206 00:19:41,120 --> 00:19:47,240 And since Matomo is a stupid system, it will just take the URLs as you send them to Matomo. 207 00:19:47,240 --> 00:19:52,760 A bad example is if your CMS will have this kind of structure. 208 00:19:52,760 --> 00:19:57,600 This is also bad if you're into SEO, of course. 209 00:19:57,600 --> 00:20:05,280 But the point here is like if you go to your page views report, you will see something 210 00:20:05,280 --> 00:20:09,240 like this if you do a nice URL structure. 211 00:20:09,240 --> 00:20:16,400 You will have aggregated numbers matching your sections. 212 00:20:16,400 --> 00:20:26,640 So this is really a nice way to or a nice feature of having nice applications. 213 00:20:26,640 --> 00:20:34,200 You can actually if you have problems, you can actually fix some problems in this one. 214 00:20:34,200 --> 00:20:43,000 So sometimes I've seen applications where we have multiple URLs for the same page, which 215 00:20:43,000 --> 00:20:44,840 is obviously bad. 216 00:20:44,840 --> 00:20:54,120 Then you can add those things into this URL cleanup JavaScript as well and maybe make 217 00:20:54,120 --> 00:21:01,560 sure that they are reporting under the same URL in your or in Matomo at least. 218 00:21:01,560 --> 00:21:04,800 There are lots of ways we can help here. 219 00:21:04,800 --> 00:21:07,320 All right. 220 00:21:07,320 --> 00:21:14,740 So let's look into optimizing Matomo from a more of a server perspective. 221 00:21:14,740 --> 00:21:22,080 So some common issues in Matomo is that our transitions report is slow if you use that 222 00:21:22,080 --> 00:21:23,080 one. 223 00:21:23,080 --> 00:21:30,120 Custom date ranges are slow because what happens if you use a custom date range is that you're 224 00:21:30,120 --> 00:21:33,640 actually querying the raw data. 225 00:21:33,640 --> 00:21:39,240 The visitor log obviously is something that can be slow. 226 00:21:39,240 --> 00:21:46,440 Sometimes you can see that today's data is not coming in as quickly as you would like 227 00:21:46,440 --> 00:21:47,680 to. 228 00:21:47,680 --> 00:21:55,640 Maybe you don't see data for today until the evening or whatever, or users get 500 errors 229 00:21:55,640 --> 00:21:58,760 or timeout errors or things like that. 230 00:21:58,760 --> 00:22:04,000 There's no general solutions here, but if you have a lot of these and this is quite 231 00:22:04,000 --> 00:22:11,560 common if your data grows, you might want to look into your server configurations. 232 00:22:11,560 --> 00:22:15,080 First of all, look into the general guidelines. 233 00:22:15,080 --> 00:22:23,400 Always start there to see if you're, are you even following the basics before you go into 234 00:22:23,400 --> 00:22:29,440 maybe my more advanced recommendations later on. 235 00:22:29,440 --> 00:22:34,880 In general, you need to understand that Matomo is a PHP application and we usually refer 236 00:22:34,880 --> 00:22:38,520 to that as a LAMP installation. 237 00:22:38,520 --> 00:22:42,440 That means we have an Apache web server. 238 00:22:42,440 --> 00:22:45,920 We have MySQL and we have PHP. 239 00:22:45,920 --> 00:22:48,840 Those are the basic main components. 240 00:22:48,840 --> 00:22:55,440 There are more and you can actually change Apache to Varnish or some other web server 241 00:22:55,440 --> 00:23:05,480 and MySQL can be MariaDB, so don't take it too much of a, yeah, it can vary. 242 00:23:05,480 --> 00:23:12,240 But when you optimize these, you need to understand what kind of needs these systems have. 243 00:23:12,240 --> 00:23:19,000 So normally Apache is a very CPU intensive system, meaning what you can optimize there 244 00:23:19,000 --> 00:23:27,520 is usually threads and sometimes a bit of a memory, but you can optimize how you use 245 00:23:27,520 --> 00:23:28,520 the CPU. 246 00:23:28,520 --> 00:23:37,480 MySQL is usually memory and disk intensive, so make sure to have fast disks and then you 247 00:23:37,480 --> 00:23:45,680 can often tweak MySQL variables and this is actually one thing when I talk to IT departments 248 00:23:45,680 --> 00:23:53,840 that has been trying to maintain Matomo, they usually install the database in kind of the 249 00:23:53,840 --> 00:24:01,240 default setup that comes with MySQL and that is probably the easiest way to make your Matomo 250 00:24:01,240 --> 00:24:09,680 server slow because if you know MySQL, you can optimize I think over 200 variables. 251 00:24:09,680 --> 00:24:18,680 You shouldn't, but you can, but MySQL application like Matomo that is very database intensive 252 00:24:18,680 --> 00:24:24,360 really, really needs a lot of optimizations and there are great tools to help you with 253 00:24:24,360 --> 00:24:25,360 that. 254 00:24:25,360 --> 00:24:35,040 PHP is of course super important and for that we usually use the memory and super important 255 00:24:35,040 --> 00:24:43,880 to cache PHP because PHP is a language that will reprocess every page every time you see 256 00:24:43,880 --> 00:24:50,480 it basically or every time a function is loaded, it's actually processed unless you use caching. 257 00:24:50,480 --> 00:24:56,360 So caching is key when it comes to PHP. 258 00:24:56,360 --> 00:25:04,000 One thing that is super nice and particularly on larger installation is to use the Q tracking 259 00:25:04,000 --> 00:25:13,840 plugin which will offload your database significantly and also give you features such as you can 260 00:25:13,840 --> 00:25:17,520 restart your Matomo installation without losing data. 261 00:25:17,520 --> 00:25:26,000 I really encourage you to look into this one if you haven't already. 262 00:25:26,000 --> 00:25:29,920 You can also configure tons of things in Matomo. 263 00:25:29,920 --> 00:25:37,920 I linked to this little file on matomo.org. 264 00:25:37,920 --> 00:25:44,040 I'm going to show you that actually. 265 00:25:44,040 --> 00:25:52,160 Let's open that file just to give you an impression but in your Matomo installation you have a 266 00:25:52,160 --> 00:25:55,520 file called global.ini.php. 267 00:25:55,520 --> 00:26:00,240 This one is actually from GitHub but just to give you an idea how much you can actually 268 00:26:00,240 --> 00:26:10,520 configure on your Matomo server or your particular Matomo instance, you can tweak a lot of things 269 00:26:10,520 --> 00:26:18,920 and this is great because it means we can adopt Matomo in tons of ways but it also means 270 00:26:18,920 --> 00:26:24,840 that you can get problems if you don't know what you're doing. 271 00:26:24,840 --> 00:26:30,960 So one thing you usually want to have control of is the number of rows that you allow in 272 00:26:30,960 --> 00:26:33,400 different reports. 273 00:26:33,400 --> 00:26:38,200 From a user perspective you normally want to tweak this upwards because I think the 274 00:26:38,200 --> 00:26:43,400 default is that you show only 500 URLs in reports. 275 00:26:43,400 --> 00:26:53,000 If you have a lot of URLs you obviously want more than 500 but setting them too high and 276 00:26:53,000 --> 00:26:56,200 that means affecting performance. 277 00:26:56,200 --> 00:27:00,640 Auto scanning of forms if you use the form analytics plugin is something you should be 278 00:27:00,640 --> 00:27:04,480 really careful with that can make your database grow. 279 00:27:04,480 --> 00:27:08,560 So keep an eye on how long you store your raw data. 280 00:27:08,560 --> 00:27:12,440 This is normally something you should know when you start but sometimes people don't 281 00:27:12,440 --> 00:27:20,360 have a clue what that is because that means the database is growing a lot. 282 00:27:20,360 --> 00:27:30,160 And also be a bit careful with plugins because plugins will consume a lot of memory and if 283 00:27:30,160 --> 00:27:35,520 you don't need them don't install them. 284 00:27:35,520 --> 00:27:41,220 Yes MySQL, yeah this is actually something we learned. 285 00:27:41,220 --> 00:27:48,200 We are using MariaDB and what we learned is that there are actually features in Matomo 286 00:27:48,200 --> 00:27:54,800 that use functions in MySQL that are not available in MariaDB. 287 00:27:54,800 --> 00:27:59,120 So if you're having a large database you might run into problems. 288 00:27:59,120 --> 00:28:06,960 If you want to know the details I think Jorge, my colleague linked to those issues at Matomo 289 00:28:06,960 --> 00:28:15,220 but one thing on a large database at least you need to keep an eye on the database size. 290 00:28:15,220 --> 00:28:22,100 You can use tools like Tuning Primary or MySQL Tuner to get the insights and one thing we 291 00:28:22,100 --> 00:28:28,680 also learned is that over time the number of tables in Matomo will grow. 292 00:28:28,680 --> 00:28:36,760 So you might want to check these configuration options or actually configuration variables 293 00:28:36,760 --> 00:28:44,560 in MySQL because if you don't adjust these according to your database you might have 294 00:28:44,560 --> 00:28:48,040 performance issues. 295 00:28:48,040 --> 00:28:55,600 There are also optimizations that you can do to make transition reports faster. 296 00:28:55,600 --> 00:29:00,600 So there is some FAQ there. 297 00:29:00,600 --> 00:29:02,520 Every Matomo installation is quite unique. 298 00:29:02,520 --> 00:29:08,280 So I wouldn't recommend you to just apply every index for databases that you find. 299 00:29:08,280 --> 00:29:13,760 You should really try them and test how they affect your installations. 300 00:29:13,760 --> 00:29:19,520 Indexes can grow your database quite a lot and you should do this with care. 301 00:29:19,520 --> 00:29:26,160 This is obviously something you can look into and test. 302 00:29:26,160 --> 00:29:31,840 Another thing that is quite common is that if you have a lot of segments and custom reports 303 00:29:31,840 --> 00:29:38,120 you might want to start looking into how to optimize the archiving that happens under 304 00:29:38,120 --> 00:29:39,120 the hood. 305 00:29:39,120 --> 00:29:45,720 One thing we've been doing is that if a client has a lot of websites normally there is only 306 00:29:45,720 --> 00:29:51,560 one or two websites that have a lot of data and a lot of segments and the other websites 307 00:29:51,560 --> 00:29:54,760 in the installation are usually small. 308 00:29:54,760 --> 00:30:00,200 So one good thing is actually to separate these into different jobs meaning one job 309 00:30:00,200 --> 00:30:04,640 that might be heavy and take a lot of time will not affect the smaller sites. 310 00:30:04,640 --> 00:30:12,560 So having two jobs you can just do this, skip ID site number one, then force ID site equals 311 00:30:12,560 --> 00:30:19,080 one on the other job and that will offload or actually give faster reports for the small 312 00:30:19,080 --> 00:30:21,000 sites. 313 00:30:21,000 --> 00:30:27,720 Another example is to separate so that you run segments separately. 314 00:30:27,720 --> 00:30:36,120 That means that if you're not applying a segment those jobs might be faster but then you might 315 00:30:36,120 --> 00:30:44,720 have other segments that might take time if someone configured for instance segments containing 316 00:30:44,720 --> 00:30:52,520 wildcards that takes a lot of time you can actually separate these into specific jobs. 317 00:30:52,520 --> 00:31:00,000 Maybe these reports will take hours to process but the other segments will respond faster. 318 00:31:00,000 --> 00:31:05,680 So that's something you can look into and start tweaking. 319 00:31:05,680 --> 00:31:14,120 Another example is that you want to start optimizing your archiving processes. 320 00:31:14,120 --> 00:31:20,960 You can actually use these flags like concurrent request per website or concurrent archivers 321 00:31:20,960 --> 00:31:30,080 and you can also start leveraging with the PHP memory that an archive can consume. 322 00:31:30,080 --> 00:31:38,320 But this is really something you should know what you're doing before trying or you can 323 00:31:38,320 --> 00:31:42,880 try it and see and learn. 324 00:31:42,880 --> 00:31:49,680 One thing that I've been doing is that I've actually created a plugin. 325 00:31:49,680 --> 00:31:55,320 To warn you a bit this is actually a plugin I wrote to learn how to write plugins. 326 00:31:55,320 --> 00:32:02,720 So it's maybe not the most beautiful code I ever written but it's been a good learning. 327 00:32:02,720 --> 00:32:10,000 But what it actually does is that it adds some reports to my Matomo installation. 328 00:32:10,000 --> 00:32:15,640 So under the hood I can go in and get a few recommendations. 329 00:32:15,640 --> 00:32:23,080 So I have these performance tips for instance that will give me some insights on how my 330 00:32:23,080 --> 00:32:34,560 database feels and I'm doing some calculations on how my MySQL database is performing and 331 00:32:34,560 --> 00:32:41,680 in this case for instance we're not really following the recommendations on how to configure 332 00:32:41,680 --> 00:32:42,680 Matomo. 333 00:32:42,680 --> 00:32:48,120 This is a database setting that we should change and I also linked to a recommendation 334 00:32:48,120 --> 00:32:49,120 for Matomo. 335 00:32:49,120 --> 00:32:54,320 So yeah this is something I've been improving over time and we actually use it in our database 336 00:32:54,320 --> 00:32:57,200 to get some feedback. 337 00:32:57,200 --> 00:33:02,440 Another thing I recently added is regarding these segments. 338 00:33:02,440 --> 00:33:03,680 I wasn't logged in. 339 00:33:03,680 --> 00:33:06,680 I will fix that. 340 00:33:06,680 --> 00:33:13,840 I can't actually because this one requires me to be on a VPN so we won't do that. 341 00:33:13,840 --> 00:33:18,920 But I actually created a list of problematic segments. 342 00:33:18,920 --> 00:33:21,640 So segment containing wildcards. 343 00:33:21,640 --> 00:33:31,680 You can go to this repository and download it and please give feedback. 344 00:33:31,680 --> 00:33:38,040 Maybe you want to help to contribute to make it to follow the coding guidelines on Matomo. 345 00:33:38,040 --> 00:33:47,040 But basically I have created some ways to easily get status variables and things like 346 00:33:47,040 --> 00:33:52,920 that. 347 00:33:52,920 --> 00:33:53,920 What else? 348 00:33:53,920 --> 00:34:04,280 We have time for questions. 349 00:34:04,280 --> 00:34:05,480 Thank you. 350 00:34:05,480 --> 00:34:11,400 We have one question which is would you be interested in publishing the DB health plugin 351 00:34:11,400 --> 00:34:12,400 on the marketplace? 352 00:34:12,400 --> 00:34:13,400 Definitely. 353 00:34:13,400 --> 00:34:15,400 But I need some help. 354 00:34:15,400 --> 00:34:19,960 Maybe Lukas can help me to contribute. 355 00:34:19,960 --> 00:34:26,400 But I would definitely be interested in doing that. 356 00:34:26,400 --> 00:34:36,640 It's already free to download so I guess if Lukas help out it would be nice. 357 00:34:36,640 --> 00:34:39,440 You might look into it. 358 00:34:39,440 --> 00:34:42,720 So if there are any other questions feel free to ask them. 359 00:34:42,720 --> 00:35:11,960 I'm just going to give a few moments for that. 360 00:35:11,960 --> 00:35:12,960 All right. 361 00:35:12,960 --> 00:35:16,360 So no new questions. 362 00:35:16,360 --> 00:35:17,360 It seems great. 363 00:35:17,360 --> 00:35:18,360 Someone's typing. 364 00:35:18,360 --> 00:35:19,360 Yeah. 365 00:35:19,360 --> 00:35:28,240 Just a few more seconds maybe. 366 00:35:28,240 --> 00:35:43,560 I mean as always the chat stays open. 367 00:35:43,560 --> 00:35:49,000 So maybe if you see a question later you can answer it then. 368 00:35:49,000 --> 00:35:56,080 But with that for now I think I want to thank you again. 369 00:35:56,080 --> 00:36:00,920 I mean you're going to have another talk I think next hour but have a great day until 370 00:36:00,920 --> 00:36:01,920 then. 371 00:36:01,920 --> 00:36:02,920 Yeah.