mirror of
https://github.com/Findus23/BachelorsThesis.git
synced 2024-08-27 19:52:12 +02:00
ton of commas
This commit is contained in:
parent
96394820f3
commit
0aedef73ce
6 changed files with 23 additions and 23 deletions

@@ -1,16 +1,16 @@


% !TeX spellcheck = en_US


\chapter{Introduction}\label{introduction}




One important question for planet formation is how water got to Earth. The part of the protoplanetary disk closest to the Sun was too hot for water to condense on Earth during its formation. And while there are theories that the snow line (the boundary inside which water ice cannot condense) moved during Earth's formation\footcite{snowline}, the most popular theory is that water moved inwards in the solar system through collisions of water-rich protoplanets.%


\todo{citation needed}






\section{The perfect merging assumption}




To better understand how this process works, large n-body simulations over the lifetime of the solar system have been conducted\footnote{for example \cite{dvorakSimulation}}. For simplicity, most of these neglect the physical details of collisions and instead assume that a perfect merging occurs: the entire mass of the two progenitor bodies, and especially all of their water (ice), is retained in the newly created body. Obviously this is a simplification, as in real collisions perfect merging is very rare and most of the time either partial accretion or a hit-and-run encounter occurs.\footcite{CollisionTypes} Therefore, the amount of water retained after collisions is consistently overestimated in these simulations. Depending on parameters like the impact angle and velocity, a large fraction of mass and water can be lost during collisions.\footcite{MaindlSummary}




\section{Some other heading}


\todo{find a name for this heading}




To understand exactly how this water transport works, one has to estimate the mass and water fractions that are retained in two-body collisions, depending on the parameters of the impact.

First, I will briefly describe the simulation setup, the important parameters and the post-processing of the results (Chapter \ref{chapter:simulations}). Next, I will summarize the results of the simulations and their properties (Chapter \ref{chapter:results}). In the main section, I will then describe three different approaches to interpolate and generalize these results for arbitrary collisions (Chapter \ref{chapter:interpolations}).


@@ -7,7 +7,7 @@ For a realistic model of two gravitationally colliding bodies the SPH (\textit{s




In the simulation, two celestial bodies are placed far enough apart that tidal forces can affect the collision (5 times the sum of their radii). Both objects consist of a core with the physical properties of basalt rock and an outer mantle made of water ice. These two-body collisions are similar to those that happen between protoplanets or the collision that created the Earth's Moon.\footcite{dvorakMoon}




To keep the simulation time short and make it possible to run many simulations with varying parameters, 20k SPH particles are used; each simulation covers 12 hours of simulated time, and the current state is saved every 144 seconds.
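As a quick sanity check (plain arithmetic, not part of the thesis code), the simulated timespan and the save cadence together fix the number of output snapshots:

```python
# 12 hours of simulated time, state saved every 144 seconds
sim_time_s = 12 * 3600      # total simulated time in seconds
save_interval_s = 144       # output cadence in seconds
n_snapshots = sim_time_s // save_interval_s
print(n_snapshots)  # 300
```

This matches the 300 output timesteps used when running \texttt{miluphcuda}.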




\section{Parameters}


\label{sec:parameters}



@@ -27,11 +27,11 @@ The impact angle is defined in a way that $\alpha=\ang{0}$ corresponds to a head




\subsection{Target and projectile mass}




The total masses in these simulations range from about two Ceres masses (\SI{1.88e+21}{\kilogram}) to about two Earth masses (\SI{1.19e+25}{\kilogram}). In addition to the total mass $m$, the mass fraction between projectile and target, $\gamma$, is defined. As the whole setup is symmetric between the two bodies, only mass fractions less than or equal to one have been considered.




\subsection{Water fraction of target and projectile}




The last two parameters are the mass fractions of ice relative to the total mass of each of the two bodies. To keep the number of parameter combinations, and therefore the number of required simulations, low, only \SI{10}{\percent} and \SI{20}{\percent} are simulated in the first simulation set.






\begin{table}



@@ -50,7 +50,7 @@ The last two parameters are the mass fraction of the ice to the total mass of ea




\section{Execution}\todo{think of a better title}




In the first simulation run, a separate simulation has been started for every parameter combination from Table \ref{tab:first_simulation_parameters}. First, the parameters and other configuration options are written to a \mbox{\texttt{simulation.input}} text file. Afterwards, the relaxation program described in \cite[24\psqq]{Burger2018} generates relaxed initial conditions for all 20k particles and saves their state to \texttt{impact.0000}. Finally, \texttt{miluphcuda} can be executed with the following arguments to simulate, starting from this initial condition, for 300 timesteps, each of which is saved to an \texttt{impact.XXXX} file.




\begin{lstlisting}[language=bash,flexiblecolumns=false]


miluphcuda -N 20000 -I rk2_adaptive -Q 1e-4 -n 300 -a 0.5 -H -t 144.0 -f impact.0000 -m material.cfg -s -g



@@ -62,14 +62,14 @@ This simulation run ran on the \texttt{amanki} server using a \texttt{Nvidia GTX


\section{Post-Processing}


\label{sec:postprocessing}




After the simulation, the properties of the SPH particles need to be analyzed. To do this, the \texttt{identify\_fragments} C program by Christoph Burger (part of the post-processing tools of \texttt{miluphCUDA}) uses a friends-of-friends algorithm to group the final particles into fragments. Afterwards, \texttt{calc\_aggregates} calculates the mass of the two largest fragments together with their gravitationally bound fragments, and its output is written to a simple text file (\texttt{aggregates.txt}).
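To illustrate the grouping step, here is a minimal friends-of-friends sketch: a naive $O(n^2)$ union-find toy, not the actual \texttt{identify\_fragments} implementation. The particle positions and linking length are made up.

```python
import numpy as np

def friends_of_friends(positions, linking_length):
    """Group particles into fragments: any two particles closer than the
    linking length belong to the same fragment (transitively)."""
    n = len(positions)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # naive all-pairs check; real implementations use tree structures
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(positions[i] - positions[j]) < linking_length:
                union(i, j)

    roots = [find(i) for i in range(n)]
    labels = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return [labels[r] for r in roots]

# two well-separated clumps of two particles each -> two fragments
pos = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
print(friends_of_friends(pos, linking_length=0.5))  # [0, 0, 1, 1]
```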




% This way, the mass retention (total mass of the two largest fragments compared to the total mass of projectile and target) and the water retention can be determined for every simulation result.




\section{Resimulation}


\label{sec:resimulation}




To increase the amount of available data and especially to reduce the errors caused by the grid-based parameter choices (Table \ref{tab:first_simulation_parameters}), a second simulation run has been started. All source code and initial parameters have been left the same, apart from the six main input parameters described above. These are set to random values in the ranges listed in Table \ref{tab:resimulationparameters}, apart from the initial water fractions. As these seem to have little impact on the outcome (see Section \ref{sec:cov}), they are set to \SI{15}{\percent} to simplify the parameter space.
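The random parameter draw can be sketched as follows; the ranges here are hypothetical placeholders standing in for the actual table values, with only the mass range taken from the text above.

```python
import numpy as np

rng = np.random.default_rng(42)

# hypothetical ranges standing in for the resimulation parameter table
ranges = {
    "impact_angle_deg": (0.0, 60.0),
    "velocity_vesc": (1.0, 5.0),          # in units of the escape velocity
    "total_mass_kg": (1.88e21, 1.19e25),  # two Ceres to two Earth masses
    "gamma": (0.1, 1.0),                  # projectile-to-target mass fraction
}
params = {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}

# the water fractions are fixed instead of drawn randomly
params["water_fraction_target"] = 0.15
params["water_fraction_projectile"] = 0.15
```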




\begin{table}


\centering





@@ -2,7 +2,7 @@


\chapter{Results}


\label{chapter:results}




For the large set of simulations, we can now extract the needed values. The output of the relaxation program (\texttt{spheres\_ini\_log}) gives us the precise values for the impact angle and velocity and the exact masses of all bodies. As these values differ slightly from the parameters explained in Section \ref{sec:parameters} due to the setup of the simulation, only the precise values are considered in the following steps. From the \texttt{aggregates.txt} file explained in Section \ref{sec:postprocessing}, the final masses and water fractions of the two largest fragments are extracted. From these, the main output considered in this analysis, the water retention of the two fragments, can be calculated.
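One plausible way to compute such a water retention value from the fragment masses and water fractions is sketched below; this is a hedged illustration of the idea, not the thesis' actual extraction code, and the numbers are made up.

```python
def water_retention(m_frag, wf_frag, m_init, wf_init):
    """Fraction of the initial water mass that ends up in the two largest
    fragments, given their masses and water mass fractions."""
    water_final = sum(m * w for m, w in zip(m_frag, wf_frag))
    water_initial = sum(m * w for m, w in zip(m_init, wf_init))
    return water_final / water_initial

# example: two fragments after a collision of two equal-mass bodies
# that each started with a 15% water fraction
r = water_retention([4e21, 1e21], [0.10, 0.05],
                    [3e21, 3e21], [0.15, 0.15])
```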




\section{Correlations}


\label{sec:cov}





@@ -5,7 +5,7 @@




One of the easiest ways to interpolate a new value between two known values is linear interpolation. It takes the two closest known values and draws a linear function between them.




In one dimension, linear interpolation is pretty trivial. For example, let's assume that we have 20 random points $P$ between 0 and 1 (\textcolor{Red}{\textbullet} and \textcolor{Blue}{\textbullet} in Figure \ref{fig:onediminterpolation}) and a new point $I$ (\textcolor{Green}{\textbullet}) at $0.4$ for which we want to interpolate. Finding the two closest points \textcolor{Red}{\textbullet} above and below is trivial, as there is only one dimension to compare. Now, if we have measured a value $f(P)$ for each of these points, a straight line (\textcolor{LightGreen}{\textbf{---}}) between the two closest values can be drawn and an interpolated value for $f(I)$ can be found.
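This one-dimensional construction is exactly what \texttt{numpy.interp} does; a small sketch with made-up sample values:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.random(20))      # 20 random points P between 0 and 1
y = np.sin(2 * np.pi * x)        # a stand-in for the measured f(P)

# np.interp finds the two neighbouring points of 0.4 and evaluates
# the straight line between them
f_i = np.interp(0.4, x, y)
```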









@@ -28,7 +28,7 @@ In one dimension linear interpolation is pretty trivial. For example, let's assu




\end{figure}




In two dimensions, things get more complicated, as we now have a set of points with $X$ and $Y$ coordinates (Figure \ref{fig:3dinterpolate1}). One fast way to find the points closest to the point to be interpolated is Delaunay triangulation. It separates the space between the points into triangles while trying to maximize their smallest angle. Afterwards, the closest three points can be found very quickly by checking the nodes of the surrounding triangle (Figure \ref{fig:3dinterpolate2}). If we now again have a function $f(X,Y)$, similar to the one-dimensional example (Figure \ref{fig:3dinterpolate3}), we can create a unique plane through the three points and get the interpolated value for any $X$ and $Y$ on this plane.
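This is the construction \texttt{scipy.interpolate.griddata} implements. A small sketch with a made-up linear $f(X,Y)$, for which the plane through the triangle corners reproduces the function exactly:

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)
# the four corners guarantee that (0.5, 0.5) lies inside the convex hull
corners = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
points = np.vstack([corners, rng.random((30, 2))])
values = points[:, 0] + points[:, 1]   # a stand-in for f(X, Y)

# griddata builds a Delaunay triangulation and evaluates the plane
# through the three corners of the enclosing triangle
f_i = griddata(points, values, np.array([[0.5, 0.5]]), method="linear")[0]
```

Since the sample function is itself linear, the interpolated value equals $0.5 + 0.5 = 1$ up to floating-point error.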






\begin{figure}[h] % also temporary




43_nn.tex

@@ -26,9 +26,9 @@ Another method that is good at taking pairs of input and output values and then




The idea behind artificial neural networks is to emulate the functionality of neurons by having nodes that are connected to each other. The weights $w$ of these connections are modified during training to represent the training data and can then be used to predict results for input values not seen during training.




Every neural network needs an input layer with as many nodes as there are input parameters and an output layer with one node per output value. In between, there can be multiple hidden layers with an arbitrary number of nodes (Figure \ref{fig:neuralnetworkgeneral}).




If we first consider only a single neuron, then on every iteration it calculates the sum over all input values multiplied by their weights $w$. Afterwards, an activation function $g$ is applied to the sum $z$ to get the prediction $\hat{y}$.




\begin{equation}


z=\sum_{i}w_ix_i \qquad \hat{y}=g(z)



@@ -42,13 +42,13 @@ After this first step (the \textit{feedforward}) is done, the weights can be mod


L(\hat{y},y)=\sum_{i}(\hat{y}_i-y_i)^2


\end{equation}




To update the weights, the derivative of the loss function with respect to the weights is calculated and subtracted, scaled by a learning rate, from the existing weights.\todo{more details?}\footcite{NNpython}
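The feedforward and weight-update steps for a single neuron can be sketched in a few lines of \texttt{numpy}. This is a toy example, not the thesis' setup: the sigmoid activation, the learning rate of 0.05 and the targets (generated from known weights) are all assumptions, and the mean is used instead of the sum in the loss.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
x = rng.random((50, 3))                        # 50 samples, 3 input values
y = sigmoid(x @ np.array([0.5, -1.0, 2.0]))    # targets from known weights
w = np.zeros(3)                                # initial weights

loss_before = np.mean((sigmoid(x @ w) - y) ** 2)
for _ in range(2000):
    y_hat = sigmoid(x @ w)                     # feedforward: g(sum_i w_i x_i)
    # chain rule: dL/dw = x^T ((y_hat - y) * g'(z)); step against the gradient
    grad = x.T @ ((y_hat - y) * y_hat * (1.0 - y_hat))
    w -= 0.05 * grad
loss_after = np.mean((sigmoid(x @ w) - y) ** 2)
```

After training, the loss has dropped well below its initial value and the weights approach the ones used to generate the targets.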




\subsection{Implementation}




As building a neural network from scratch gets complex very quickly, it is easier to use \texttt{Keras}\footnote{\url{https://keras.io}}, which provides easy-to-use high-level functions on top of the calculations provided by \texttt{TensorFlow}\footnote{\url{https://www.tensorflow.org/}}. To build our network, we only need to specify the structure of the layers, provide our input data and let the network train for 200 epochs (iterations of feedforward and backpropagation).




The network needs six nodes in the input layer for the input parameters and one node in the output layer for the prediction. In between are two layers with decreasing numbers of nodes, as this seems to give the best results (Figure \ref{fig:neuralnetworkgraph}).




\begin{lstlisting}[language=Python,caption=the used model as Keras code,label=lst:model]


from keras import Sequential



@@ -67,7 +67,7 @@ model.fit(x, Y, epochs=200, validation_data=(x_test, Y_test))




\subsection{Training}




To find the ideal parameters to use, the simulation data (excluding the data from Section \ref{sec:comparison}) is split into two groups: the complete original set of simulations and \SI{80}{\percent} of the new simulation set are used to train the neural network, while the remaining \SI{20}{\percent} are used for validation. This means that after every epoch, the loss function is calculated not only for the training data, but also for the separate validation data (Figure \ref{fig:loss_val}). Finally, the model with the lowest loss on the validation data set was chosen (Listing \ref{lst:model}).
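The 80/20 split of the new simulation set can be sketched as follows; the set size here is a made-up placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)
n_new = 500                          # hypothetical size of the new simulation set
idx = rng.permutation(n_new)         # shuffle before splitting
split = int(0.8 * n_new)
train_idx, val_idx = idx[:split], idx[split:]   # 80% training, 20% validation
```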






\begin{figure}[h] % also temporary



@@ -90,7 +90,7 @@ To find the ideal parameters to use the simulation data (excluding the data from




\end{figure}




After the training, the resulting model can be saved in a small \texttt{HDF5} file, which can then be used to evaluate the model very quickly (about \SI{100}{\milli\second} for \num{10000} interpolations).






\subsection{Results}





@@ -2,12 +2,12 @@


\section{Comparison}


\label{sec:comparison}




To compare the three methods explained above and measure their accuracy, an additional set of 100 simulations (with the same properties as those listed in Section \ref{sec:resimulation}) was created. These results are neither used to train nor to select the neural network, nor are they in the dataset for griddata and RBF interpolation. Therefore, we can use them to generate predictions for their parameters and compare those with the real fraction of water that remained in these simulations. By taking the mean absolute difference and the mean squared error between the predictions and the real results, the accuracy of the different methods can be estimated (Table \ref{tab:comparison}). As one of these parameter sets is outside the convex hull of the training data and griddata can't extrapolate, this simulation is skipped and only the remaining 99 simulations are considered for the griddata accuracy calculation.
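The two error measures can be written down in a few lines; the example values below are made up.

```python
import numpy as np

def mean_errors(predicted, actual):
    """Mean absolute difference and mean squared error between the
    interpolated water retentions and the simulated ones."""
    diff = np.asarray(predicted) - np.asarray(actual)
    return float(np.mean(np.abs(diff))), float(np.mean(diff ** 2))

mae, mse = mean_errors([0.5, 0.7, 0.9], [0.4, 0.7, 1.0])
```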




Of the three methods, the trained neural network has the highest mean squared error. This seems to be at least partly caused by the fact that the data is generalized during training, causing the final network to output the \enquote{smoothest} interpolations. While this causes higher errors, it is possible that the finely structured details in the simulation output are just an artifact of the simulation setup and don't represent real-world collisions.


\todo{better wording}




Another important aspect to compare is the interpolation speed. The neural network is able to give the 100 results in about \SI{4}{\milli\second} (after loading the trained model). RBF interpolation is still reasonably fast, taking about \SI{8.5}{\second} (\SI{85}{\milli\second} per interpolation). But as \texttt{griddata} expects a grid-based parameter space, it becomes really slow when the resimulation data with random parameters is added. A single interpolation takes about \SI{35}{\second}, totaling around an hour for all 99 test cases. Using only the original dataset brings the runtime down to around \SI{10}{\second}, but causes the results to be less accurate than all other methods (first row in Table \ref{tab:comparison}).




\begin{table}


\centering



@@ -18,6 +18,6 @@ Another important aspect to compare is the interpolation speed. The neural netwo


RBF & 0.008 & 0.057 \\


griddata & 0.005 & 0.046


\end{tabular}


\caption{Prediction accuracy for the different interpolation methods}


\label{tab:comparison}


\end{table}
