diff --git a/10_introduction.tex b/10_introduction.tex
index 785df07..2c4ebd5 100644
--- a/10_introduction.tex
+++ b/10_introduction.tex
@@ -21,4 +21,4 @@ To better understand how this process works, large n-body simulations over the l
 %\todo{find a name for this heading}
 To understand how exactly the water transport works, one has to find an estimation of the mass and water fractions that are retained during two-body simulations depending on the parameters of the impact.
-First, I will be shortly describing the simulation setup, the important parameters and the post-processing of the results (Chapter \ref{chapter:simulations}). Next I will summarize the results of the simulations and their properties (Chapter \ref{chapter:results}). In the main section I will then be describing three different approaches to interpolate and generalize these results for arbitrary collisions (Chapter \ref{chapter:interpolations}).
\ No newline at end of file
+First, I will be shortly describing the simulation setup, the important parameters and the post-processing of the results (Chapter \ref{chapter:simulations}). Next I will summarize the results of the simulations and their properties (Chapter \ref{chapter:results}). In the main section I will then be describing three different approaches to interpolate and generalize these results for arbitrary collisions (Chapter \ref{chapter:interpolations}).
diff --git a/20_simulations.tex b/20_simulations.tex
index b4c6bbd..ebc3b14 100644
--- a/20_simulations.tex
+++ b/20_simulations.tex
@@ -50,7 +50,7 @@ The last two parameters are the mass fraction of the ice to the total mass of ea
 \section{Execution}\todo{think of a better title}
-In the first simulation run for every parameter combination from Table \ref{tab:first_simulation_parameters} a separate simulation has been started. First, the parameters and other configuration options are written in a \mbox{\texttt{simulation.input}} text file. Afterwards the relaxation program described in \cite[24\psqq]{Burger2018} generates relaxed initial conditions for all 20k particles and saves their state to \texttt{impact.0000}. Finally \texttt{miluphcuda} can be executed with the following arguments to simulate starting from this initial condition for 300 timesteps which each will be saved in a \texttt{impact.XXXX} file.
+In the first simulation run for every parameter combination from Table \ref{tab:first_simulation_parameters} a separate simulation has been started. First, the parameters and other configuration options are written in a \mbox{\texttt{simulation.input}} text file. Afterwards the relaxation program described in \cite[24\psqq]{Burger2018} generates relaxed initial conditions for all 20k particles and saves their state to \texttt{impact.0000}. Finally, \texttt{miluphcuda} can be executed with the following arguments to simulate starting from this initial condition for 300 timesteps which each will be saved in a \texttt{impact.XXXX} file.
 \begin{lstlisting}[language=bash,flexiblecolumns=false]
 miluphcuda -N 20000 -I rk2_adaptive -Q 1e-4 -n 300 -a 0.5 -H -t 144.0 -f impact.0000 -m material.cfg -s -g
@@ -64,7 +64,6 @@ This simulation ran on the \texttt{amanki} server using a \texttt{Nvidia GTX 108
 After the simulation the properties of the SPH particles needs to be analyzed. To do this, the \texttt{identify\_fragments} C program by Christoph Burger (part of the post-processing tools of \texttt{miluphCUDA}) uses a friends-of-friends algorithm to group the final particles into fragments. Afterwards \texttt{calc\_aggregates} calculates the mass of the two largest fragments together with their gravitationally bound fragments and its output is written into a simple text file (\texttt{aggregates.txt}).
-% This way, the mass retention (total mass of the two largest fragments compared to total mass of projectile and trget) and the water retention can be determined for every simulation result.
 \section{Resimulation}
 \label{sec:resimulation}
@@ -86,4 +85,4 @@ This way, an additional \num{553} simulations have been calculated on \texttt{Nv
 \end{tabular}
 \caption{parameter ranges for the resimulation}
 \label{tab:resimulation-parameters}
-\end{table}
\ No newline at end of file
+\end{table}
diff --git a/30_results.tex b/30_results.tex
index f0a5bf9..2ddf458 100644
--- a/30_results.tex
+++ b/30_results.tex
@@ -7,7 +7,7 @@ For the large set of simulations, we can now extract the needed values. The outp
 \section{Correlations}
 \label{sec:cov}
 One very easy, but sometimes flawed%
-\footnote{The Pearson correlation coefficient only measures linear correlations. With a value close to zero there can still be a non-linear correlation between the two dimensions. In addition the coefficient gives no information about the steepness of the correlation, only about which fraction of the values conform to it.}
+\footnote{The Pearson correlation coefficient only measures linear correlations. With a value close to zero there can still be a non-linear correlation between the two dimensions. In addition, the coefficient gives no information about the steepness of the correlation, only about which fraction of the values conform to it.}
 way to look at the whole dataset at once is calculating the \textit{Pearson correlation coefficient} between the input parameters and the output water fraction (Figure \ref{fig:cov}). This shows the expected result that a higher collision angle (so a more hit-and-run like collision) has a higher water retention and a higher collision speed results in less water left on the two largest remaining fragments. In addition, higher masses seem to result in less water retention. The initial water fractions of the two bodies does seem to have very little influence on the result of the simulations.
 \begin{figure}[h]
@@ -15,4 +15,4 @@ way to look at the whole dataset at once is calculating the \textit{Pearson corr
 \includegraphics[width=0.6\linewidth]{images/cov.pdf}
 \caption{The Pearson correlation coefficient visualized as a bar graph}
 \label{fig:cov}
-\end{figure}
\ No newline at end of file
+\end{figure}
diff --git a/41_griddata.tex b/41_griddata.tex
index 89ddc0e..58bb83f 100644
--- a/41_griddata.tex
+++ b/41_griddata.tex
@@ -52,7 +52,7 @@ Afterwards, the closest three points can be found very quickly by checking the n
-This approach has the advantage that it can be extended in more than two dimensions by replacing the triangle in the Delaunay triangulation with an n-simplex in n dimensions. The \texttt{scipy.spatial.Delaunay} python function allows to quickly calculate it thanks to the \texttt{Qhull} library\footnote{\url{http://www.qhull.org/}}. One noticeable limitation of this method is that data can't be extrapolated. Therefore the possible output is limited to the convex hull of the input parameter space (as seen in Figure \ref{fig:3dinterpolate-2}).
+This approach has the advantage that it can be extended in more than two dimensions by replacing the triangle in the Delaunay triangulation with an n-simplex in n dimensions. The \texttt{scipy.spatial.Delaunay} python function allows to quickly calculate it thanks to the \texttt{Qhull} library\footnote{\url{http://www.qhull.org/}}. One noticeable limitation of this method is that data can't be extrapolated. Therefore, the possible output is limited to the convex hull of the input parameter space (as seen in Figure \ref{fig:3dinterpolate-2}).
 \subsection{Implementation}
 \label{sec:griddata-implementation}
@@ -79,4 +79,4 @@ Most notable about the results of the griddata interpolation (see Figure \ref{fi
 \end{subfigure}
 \caption{Interpolation result using griddata}
 \label{fig:griddataresults}
-\end{figure}
\ No newline at end of file
+\end{figure}
diff --git a/42_rbf.tex b/42_rbf.tex
index 8ea3d9f..7cb271c 100644
--- a/42_rbf.tex
+++ b/42_rbf.tex
@@ -19,7 +19,7 @@ The RBF interpolation now consists of a linear combination of $\phi(\left\|x-x_i
 p_j&=\sum_{i=1}^{n}\lambda_i\phi(\left\|x_j-x_i\right\|),\quad j=1,2,\dots,n
 \end{align}
-Therefore this can be written as a linear matrix equation:
+Therefore, this can be written as a linear matrix equation:
 \begin{align}
 \begin{bmatrix}
diff --git a/43_nn.tex b/43_nn.tex
index f70fc53..713fdfc 100644
--- a/43_nn.tex
+++ b/43_nn.tex
@@ -20,7 +20,7 @@
 \section{Artificial Neural Networks}
-Another method that is good at taking pairs of input and output values and then able to predict output values for arbitrary input sets is using \textit{Artificial neural networks} (\texttt{ANNs}).
+Another method that is good at taking pairs of input and output values and then able to predict the output for arbitrary input sets is using \textit{Artificial neural networks} (\texttt{ANNs}).
 \subsection{Theory}
@@ -128,4 +128,4 @@ The output of the Neural Network (Figure \ref{fig:nnresults}) looks quite simila
 \end{tabular}
 \caption{Prediction accuracy for the different interpolation methods}
 \label{tab:comparison}
-\end{table}
\ No newline at end of file
+\end{table}
diff --git a/50_conclusion.tex b/50_conclusion.tex
index 390b9ae..8f2577f 100644
--- a/50_conclusion.tex
+++ b/50_conclusion.tex
@@ -7,7 +7,7 @@ All three methods for interpolation described above give results that follow the
 Of the three methods, the trained neural network has the highest mean squared error. This seems to be at least partly caused by the fact that during training of the neural network, the data is strongly generalized, causing the final network to output the \enquote{smoothest} interpolations. While this causes the errors to be higher, it might be possible that the fine structured details in the simulation output are just an artifact of the simulation setup and doesn't represent real world collisions.
-Another important aspect to compare is the interpolation speed. The neural network is able to give the 100 results in about \SI{4}{\milli\second} (after loading the trained model which takes approximately one second). RBF interpolation is still reasonably fast, taking about \SI{8.5}{\second} (\SI{85}{\milli\second} per interpolation). But as \texttt{griddata} expects a grid-based parameter space, it becomes really slow when adding the resimulation data with random parameters. A single interpolation takes about \SI{35}{\second} totaling to about an hour for all 99 test cases. Using only the original dataset brings the run time down to around \SI{10}{\second}, but causes the results to be less accurate than all other methods. (first row in Table \ref{tab:comparison})
+Another important aspect to compare is the interpolation speed. The neural network is able to give the 100 results in about \SI{4}{\milli\second} (after loading the trained model which takes approximately one second). RBF interpolation is still reasonably fast, taking about \SI{8.5}{\second} (\SI{85}{\milli\second} per interpolation). But as \texttt{griddata} expects a grid-based parameter space, it becomes really slow when adding the resimulation data with random parameters. A single interpolation takes about \SI{35}{\second} totalling to about an hour for all 99 test cases. Using only the original dataset brings the run time down to around \SI{10}{\second}, but causes the results to be less accurate than all other methods. (first row in Table \ref{tab:comparison})
 Interpolation using Radial Basis Functions all in all seems to be the most reliable method if there is enough input data and this input data is mostly spread randomly across the parameter space. It is easy to implement and quite fast to execute while still giving reasonable results. Neural Networks can also provide realistic output, but have lots more configurable parameters that need to be tuned to get usable results. Their main advantage would be more noticeable if the input set was by magnitudes larger. In this case only the training would take longer, while evaluating the trained model wouldn't change.
diff --git a/main.tex b/main.tex
index 401a1f4..be574ba 100644
--- a/main.tex
+++ b/main.tex
@@ -1,5 +1,4 @@
 % !TeX spellcheck = en_US
-% !TeX spellcheck = en_US
 \input{template.tex}
 \hypersetup{
 pdftitle={Interpolated water retention after two-body collisions using Neural Networks and linear interpolation methods},