\documentclass[a4paper]{article}
\usepackage{url}
\usepackage[all]{nowidow}
\usepackage{xcolor}
\usepackage[colorlinks=true,
linkcolor=purple,
citecolor=purple,
urlcolor=blue,
filecolor=blue]{hyperref}
\usepackage[capitalise]{cleveref}
\usepackage[maxbibnames=99, defernumbers=true, backend=bibtex, citestyle=numeric-comp, url=true, doi=false, isbn=false, giveninits=true, sortcites=true]{biblatex}
\addbibresource{references}
\usepackage{authblk}
\title{Writing a Computer Science Thesis}
\author{Tobias Pfandzelter}
\author{Martin Grambow}
\author{Trever Schirmer}
\author{David Bermbach}
\affil{Scalable Software Systems Research Group\\Technische Universit\"at Berlin \& Einstein Center Digital Future\\Berlin, Germany\\\texttt{\{tp,mg,ts,db\}@3s.tu-berlin.de}}
\date{\today\footnote{You can find the most up-to-date version of this article here: \url{https://github.com/3s-rg/thesis-tips}}}
\begin{document}
\maketitle
\begin{abstract}
Writing a computer science thesis is a considerable challenge for students.
In this text, we give some tips and structure to write a great thesis.
We will go over the research process in general, finding a topic, writing an expos\'e, and thesis structure.
At the end, we include some tips on researching and writing.
\end{abstract}
\section{Introduction}
\label{sec:introduction}
If you are reading this, you might be about to start a computer science (or \emph{Information Systems Management}, \emph{ICT Innovation}, etc.) thesis at \emph{Technische Universit\"at Berlin}, maybe even at the \emph{Scalable Software Systems} research group.
A thesis, whether it is a bachelor's or master's thesis\footnote{See~\cite{masters-apostrophe-as} for a discussion on why it is \emph{master's} and not \emph{masters}.}, is essentially a research task that you complete yourself.
This can be daunting, but you are not alone: since completing a thesis is necessary to complete a degree at TUB, you can learn from previous students' experiences.
In this article, we will go over some of those experiences and lessons in order to help you complete your thesis successfully.
Specifically, we will address the following questions:
\begin{itemize}
\item What is a thesis? How is it different from a project or seminar? (\cref{sec:what})
\item How do I find a topic for my thesis? (\cref{sec:topic})
\item How do I write an expos\'e? Why should I even do such a thing? (\cref{sec:expose})
\item How do I structure working on my thesis? (\cref{sec:structure})
\end{itemize}
Additionally, we will provide tips on research (\cref{sec:research}), writing (\cref{sec:writing}), talking (\cref{sec:talk}), working with your advisor (\cref{sec:advisor}), as well as some helpful further resources (\cref{sec:resources}) before concluding in \cref{sec:conclusion}.
If you are not a computer science student, you might still find some information in this article useful -- just keep in mind that we make some assumptions that might not apply to you.
There are two things we specifically will not go into:
First, this article will not cover the regulatory aspects of writing a thesis at TUB, including time limits, how to submit the thesis, etc.
These things change all the time, and it does not make sense to just repeat information that you can find in the \emph{AllgStuPO} and your respective \emph{StuPO} anyway.
Instead, go and read the \emph{AllgStuPO} and \emph{StuPO} now if you have not yet (even if you have, it cannot hurt to do so again as a refresher).
Let us just say that you should not plagiarize (see \cref{subsec:plags}).
Second, we will not go into detail on tooling to write your thesis.
We will just say that you should use \emph{(La)TeX} and not Word.
As a computer science (or related field) student, you should not be intimidated by a command line and text editor.
Our thesis template~\cite{thesis-template} can help you get started.
\section{What is a Thesis?}
\label{sec:what}
Before we discuss \emph{how} you write your thesis, we should first understand \emph{what} a thesis actually is.
We mentioned that a thesis is research work that you complete on your own.
Obviously, that is different from \emph{courses}, where you attend a series of lectures to learn about a topic and then write an exam at the end.
It is also completely different from most \emph{projects} that you do:
In a project, you typically solve an \emph{engineering problem}, i.e., you get a task and then work in a team (or on your own) to develop software that solves that problem.
You might additionally write tests or benchmarks to evaluate your system, but you are still essentially performing engineering work.
There is some overlap between a thesis and \emph{seminars} or \emph{reading groups} that you may have attended.
These are typically closer to actual computer science research, in that you discuss existing research.
Similarly, in a thesis you tackle a \emph{research problem}.
Research problems and engineering problems have a few things in common:
Both often require you to program some software, both require you to write text, both can be near or far from real-world problems.
One of the main differences between research and engineering is that in research, your text (or \emph{manuscript}) is the focus, and it can be supported by your implementation.
Engineering usually has a system or implementation as a result which may be supported by some text, like a documentation or report.
This distinction is something you should keep in mind during the entire time you write your thesis.
For example, if you wonder whether you should spend your limited time implementing a new feature or writing some additional text, the latter is likely to be more important.
One other thing to keep in mind is what writing a thesis is actually about:
Your goal is not to advance research (although that is usually the case) or to prove how smart you are or how well you can come up with new ideas (although both help) -- the goal of a thesis is to show that you can conduct research properly, that you have the ``research maturity''.
At the core of a research problem there is usually a \emph{research question}.
This research question can have different \emph{interrogative words} that will dictate the direction of your thesis:
If it starts with \emph{Why} or \emph{What}, you usually want to investigate a certain phenomenon (e.g., \emph{Why is the gcc compiler so slow when I compile the Linux kernel?} or \emph{What is the overhead of compiling the Linux kernel in a virtual environment?}).
On the other hand, if your question starts with \emph{How}, the focus is more on developing something new (e.g., \emph{How can we decrease the compilation time of the Linux kernel?}).
Note that there can be a bit of overlap regardless.
In our example, before you start developing something new, you should measure what currently does not work, and after you measure something, it can be useful to provide some proposals on what to change to make it more performant.
\section{Finding a Topic}
\label{sec:topic}
Before we can go into how exactly a thesis is structured, we should first see how we can find a topic, i.e., how to determine your research question.
To find a research question, you should start with a research area you are interested in.
You can find your research area by considering what you learned about in lectures, projects, and seminars, or you do some initial investigations on what areas are currently of interest to the research community.
Most research topics are given out by or created with help from your advisor.
They might, e.g., have an ongoing research project that you can contribute to.
To find an advisor, consider the research groups where you have done some specialization and, e.g., have successfully completed a (mandatory-elective) module.
Check the profiles of the different members of the group and see if there is someone whose research area matches your preferences.
When you email them, make clear what your research interests are and, ideally, propose an interesting topic that you have already thought about.
If you have already identified a relevant topic or generally have an interesting topic in mind, your chances of finding a supervisor will increase because it shows that you have already dealt with this field.
\section{Writing an Expos\'e}
\label{sec:expose}
Before you can start working on your actual thesis, it is common practice to write an \emph{expos\'e}.
Think of your expos\'e as an outline to your thesis, covering the most important points:
\begin{enumerate}
\item present the research area you are working on
\item explain your research question
\item outline your approach to answer your research question
\end{enumerate}
An expos\'e in our group should typically only cover 1-1.5 A4 pages and can be written in three to five paragraphs.
Moreover, there should be only your name, student ID, and a possible thesis title written at the top.
Your expos\'e will usually not include footnotes or references as it is only meant for you, your advisor, and possibly the professor (who will be the official examiner of your thesis).
The expectation is that you already have an understanding of the research area and did some initial research (i.e., paper reading).
However, you do not need to have a comprehensive overview of all existing approaches to the research questions, as that will be part of the research process while writing your thesis.
You can derive the title from your research question, e.g., turning \emph{Why is compiling the Linux kernel so slow?} into \emph{A Performance Analysis of Linux Kernel Compilation}.
The first paragraph of your expos\'e should cover your research area and provide some context for further explanations.
The second paragraph introduces your research problem.
Here, try to be as precise as possible to avoid confusion during the work on your thesis.
Be sure to also include motivation for your question: why is it a problem, for whom is it a problem, and why is it important that someone takes the time to work on this?
The third to fifth paragraphs should focus on what you actually plan to do in the thesis; they are basically your work plan.
You can read more about what you \emph{should} do in \cref{sec:structure}, but for now keep in mind the two questions that you should answer in these paragraphs:
\begin{enumerate}
\item What is the approach that you plan to take to give an answer to the research question?
\item How will you evaluate your approach?
\end{enumerate}
There will typically be a few iterations of the expos\'e before it is in an acceptable state.
The goal of this process is to let you and your advisor have a common understanding and perspective on your research problem.
This requires discussion about different perspectives, so do not worry if it takes a few days.
However, the time between iterations is entirely up to you.
What is stated in the expos\'e is not a fixed list of requirements and tasks; it is not set in stone.
If you realize during your thesis that, e.g., the evaluation method proposed in the expos\'e is not suitable after all, then you can and must choose another suitable method.
The goal of this exercise is to have a plan which carries you in the right direction, not to have a to-do list.
\section{Structuring your Thesis}
\label{sec:structure}
In this section, we would like to cover two things, namely (1) how you should structure your time and tasks when working on your thesis, and (2) how the manuscript itself should be structured.
We will cover both things together since the approach is identical -- the best way to write your thesis is starting at the beginning and finishing at the end.
Of course, that is not the entire story: During your research, you will uncover new knowledge and will have to re-write earlier parts of your manuscript, or you need to restructure some paragraphs.
But the basic idea of working from start to finish remains.
And what if we told you that you have already begun?
\subsection{The Introduction}
The very first part of any manuscript is the introduction, where you give context for your research, introduce your research question, and outline your work.
Sounds familiar?
That is because the introduction basically mirrors your expos\'e.
It is supposed to give your reader an idea of what they can expect in your thesis, and it helps to set the scene for the text.
You should adapt the text of your expos\'e a bit instead of copying it directly.
For example, you will now need to add references to your claims and to expand the reasoning in more detail.
It is also common to add a list of contributions, i.e., bullet points on what you contribute to the research area, similar to what we have done in \cref{sec:introduction}.
\subsection{Background}
The next part of your research (and thesis manuscript) is the dreaded background.
At this point, you should have done some initial research, e.g., using \emph{Google Scholar}~\cite{gscholar}, read some papers, looked at some existing work in the field.
Now, you should double down and fully understand (or at least be aware of) the state-of-the-art in your research area.
Follow the citation trail (e.g., by reading the papers cited by the papers you have read, reading the papers cited by those, etc.) until you get back to where you started.
After a while, you will notice that each paper you read adds fewer and fewer new papers to your to-read pile.
In the end, you will probably have read around 100-120 papers (which you will not all need to discuss and cite in your thesis).
Generally, be smart with your time and do not spend too much on one obscure paper -- for older papers, the citation count is usually a good indicator of how well a paper has aged (more on selecting the right resources in \cref{sec:research}).
As an addition, make sure to keep some papers for your related work section (see \cref{subsec:relatedwork}) which discusses alternative approaches to your solution whereas the background section provides the necessary foundations for understanding the rest of the thesis.
In a research paper, the background section normally serves two tasks:
First, it introduces the terminology and definitions you use in the rest of your paper.
For example, there may be competing definitions on what a \emph{kernel} is: the thing that runs your Linux or the thing that makes popcorn.
This is your opportunity to set the record straight, ideally backed by some citations.
Second, the background section lets you introduce the concepts you will use in the rest of your paper.
Every non-trivial piece of information should be mentioned here.
It can be hard to decide what is considered trivial vs.~non-trivial.
Think about who reads your paper (e.g., your advisor) and try not to give too much information as that would make your paper boring.\footnote{We will cover some further information on Dos and Don'ts of writing in \cref{sec:writing}.}
As a rule of thumb, everything that you learned in compulsory lectures can be considered common knowledge in your field, everything that will be new to most recent master's graduates is non-trivial.
A background section in a thesis also serves another, indirect task:
It shows the reader that you as a student have completely understood the research area you operate in.
Conversely, if, during your writing process, you or your advisor notice that there is something missing from this section, you should go back into the literature and try to close that gap.
\subsection{The Approach}
This section should be named according to its contents (e.g., \emph{A Markov Model Approach to Linux Kernel Development}), but the basic idea is that you introduce your idea: What solves your research question?
This section should be comparatively short, but in a research paper this part is often the most important one that everything revolves around.
This somewhat differs in a thesis: Your grade will most likely not be determined by how ``great'' your idea is, as you will have discussed it in detail before with your advisor -- remember that this is not an engineering report.
You will be graded mainly on your execution of the thesis.
Do not feel pressured to have something novel or surprising here; it is often best to stick with what you and your advisor came up with together (which, for that reason, will usually already be sufficiently novel).
The idea you present here can take many forms: it can be a study design (e.g., a benchmark that you want to perform to measure overheads), a system architecture (e.g., for a new database), or an algorithm (both an entirely new algorithm to solve a hitherto unsolved problem or the application of an existing algorithm to a new problem).
Getting this section right can be difficult, especially as there can be some overlap with the next one.
Here is some advice regarding mistakes that students commonly make on their first try:
\paragraph{Do not include implementation details}
When you write a systems paper, e.g., when your idea is a new kind of database system, do not include your implementation of this system here.
For example, for the design of your database system, it does not matter whether it was written in Java or C++ or which classes you have written.
What matters is its overall design.
Again, remember that your research question is not an engineering problem.
One exception is if you require a certain technology for your approach, e.g., you investigate \texttt{gcc} and your approach does not extend to other compilers.
\paragraph{Do not include \emph{how} you arrived at your solution}
To find an approach for your research question, you will often start with a simple one and iterate on that to improve it.
While this is a useful technique, it does not translate well to your manuscript.
What you will end up with is not one approach but multiple, which makes it hard for your reader to understand quite what is going on.
There can be some cases where you want to compare different approaches to the same research question, but that slightly changes your research question:
Instead of a \emph{How should I solve it?}, the question then is \emph{Which of the approaches A, B, and C is the best one?}.
Your approach section will then often describe how you plan to execute your comparison study, while the different approaches are in your background (if in doubt, ask your advisor).
Focusing on a single approach and presenting that clearly makes it much easier for your readers to follow along and gives your thesis a clear message.
\paragraph{Do not mix evaluation and approach}
You will evaluate your approach in your thesis (and we will get to that), but your evaluation should be in a different section.
This includes the design of your evaluation, e.g., how you want to measure that your approach is good.
\paragraph{Do include a picture}
A good picture can help a lot in getting your message across, even more so if your approach includes experiment designs or systems architecture using simple lines and boxes.
Do not overdo it, though!
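As a rough illustration of what ``simple lines and boxes'' can look like in \LaTeX{}, here is a minimal sketch using the \texttt{tikz} package with its \texttt{positioning} library.
It assumes that you load both in your preamble (this article itself does not), and the three components \emph{Client}, \emph{Server}, and \emph{Database} are purely hypothetical placeholders for the parts of your own approach.
\begin{verbatim}
% In the preamble (assumption: not part of every template):
%   \usepackage{tikz}
%   \usetikzlibrary{positioning}

\begin{figure}
  \centering
  \begin{tikzpicture}[
      % one shared style for all boxes
      box/.style={draw, rectangle,
                  minimum width=2cm, minimum height=1cm}
    ]
    % hypothetical components, placed left to right
    \node[box] (client) {Client};
    \node[box, right=of client] (server) {Server};
    \node[box, right=of server] (db) {Database};
    % arrows show the direction of requests
    \draw[->] (client) -- (server);
    \draw[->] (server) -- (db);
  \end{tikzpicture}
  \caption{Overview of the (hypothetical) system architecture.}
  \label{fig:architecture}
\end{figure}
\end{verbatim}
Drawing the figure in the document itself keeps fonts consistent with your text, but any tool that produces a clean vector graphic is likely fine as well.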
\subsection{Evaluation}
After you have presented your approach, it is time to show \emph{why} it is so great with an evaluation.
An evaluation can take many forms, from a formal proof of an algorithm to a benchmark of an implementation.
Finding the best evaluation method is something to discuss with your advisor because it depends on your research question and what you have done so far.
Here are some ideas for what you can do:
\paragraph{Simulation}
A simulation is a good way to evaluate your approach efficiently in a controlled environment.
You can simply write a small simulation environment (which will often be faster than using any of the existing, full-fledged simulation frameworks) and plug in a dataset or algorithm.
You will likely need some existing approach or other baseline to compare your new approach to.
Be sure to determine some metrics that you want to measure, so you get relevant results.
\paragraph{Implementation}
An implementation of a system can be helpful as well, although it is mostly coupled with some tests or benchmarks that show that the system does what it is supposed to do or improves an existing approach.
Keep in mind that benchmarks are (usually) only meaningful when comparing to something that already exists.
Include an overview of how you implemented your system but do not go into too much detail when it is not necessary.
\paragraph{Formal Proof}
You can also include a formal proof for your approach if that is the best way to show its correctness (very unlikely in our group).
In all cases, make sure to introduce your study design explicitly at the beginning.
For simulation and benchmarking, be smart about what and how you measure:
State your expectations and try to think like an adversary by coming up with scenarios that might break your approach.
Also, when you notice that a result does not match your expectations, try to explain it and run additional experiments~\cite{mhandley-twitter}.
\subsection{Discussion}
Not every research paper includes an explicit discussion section, but having a discussion can elevate your research tremendously.
A discussion is \emph{not} an explanation of your results -- this should be part of your evaluation.
Instead, a discussion critically evaluates your approach and evaluation.
Most things we do in computer science are not perfect:
There will be edge cases, limitations, or scenarios where other approaches are better.
This section is your opportunity to acknowledge the weaknesses of your thesis and discuss why they do or do not matter (and how they could be addressed by someone else extending your approach).
You might be asking yourself:
``Why would I talk about things that do not work?!''
Because if you do not, your examiner will.
You have a unique opportunity to anticipate the comments that your examiner (or a peer reviewer in case of a research paper) might make about your approach.
This shows that you are fully aware of these limitations and edge cases, and it lets you weaken the impact of those kinds of comments.
Being able to look at your own work critically is also a testament to your scientific abilities.
Remember that the goal of research is not to sell something but to advance your field of research.
\subsection{Related Work}
\label{subsec:relatedwork}
In a related work section, you discuss other work in your area that aims to solve your research problem or related research problems.
The goal of this is not to discredit these papers but rather to show how your research builds on them, where you have used them for inspiration, and how different approaches may even be combined to create an even more effective solution (remember the point about the discussion section above that every approach will have weaknesses, including yours).
Keep it nice and friendly!
If you have read a few research papers, you might be wondering why we mention this section near the end.
You will often find it in the front of the paper, e.g., after the background section.
That can make sense for some research questions, but it will often be boring for a reader to sit through pages of related work before understanding what your idea actually is.
Instead, put it at the end where the reader has a full understanding of your approach, its effects (through evaluation), and weaknesses (through your discussion).
Go about it one paper at a time.
Mention the paper explicitly (e.g., ``Doe et al.~[10] do this and that...''), and summarize the main idea behind it.
Then, explain why this is good or bad -- also in comparison to your approach.
Sometimes, multiple papers have a categorically different approach (e.g., centralized vs.~decentralized) than yours.
In that case, group those papers and explain the advantages and disadvantages using an example.
Remember that all of these approaches do actually have advantages -- otherwise, they would not exist.\footnote{Unless you iterate on them, in which case you should also explicitly mention the similarities.}
Some students find it difficult to write this section, and we sometimes read theses that only have a few sentences here, mentioning no more than two or three papers.
Often, that means that these students have not done their research properly.
If you cannot find anything that answers your research question specifically, broaden your search area.
Maybe there is a paper that asks your research question and does not provide a solution.
Maybe a similar research problem is also posed in a different field.
Possibly, some papers you mention in your background section belong here instead or as well.
A short related work section might seem to be an indicator that your thesis is especially good because it solves a problem that no one else has looked at before and that you were now able to solve with your research superpowers -- instead, it generally means that you do not quite know what you are talking about.
\subsection{Conclusion}
Finally, it is time to conclude your thesis.
Here, give a brief overview of everything you have done.
Some research papers also use this to provide an outlook on future work you plan to do -- in most cases, students do not continue to work on their thesis projects, though (it might still be interesting to hear your thoughts on this if you have some ideas which are not implementation details/features).
\subsection{Other Sections}
The sections we have given are a general guideline on how to perform your research and how to write everything down.
You can vary the overall structure of your thesis a bit if required, but you will hardly find any published work that dramatically deviates from this common structure.
Nevertheless, you will likely need some additional components in your manuscript:
\paragraph{Abstract}
An abstract (and the German equivalent ``Zusammenfassung'') is a short summary of your work.
A reader who does not yet know whether they want to read your paper should be able to gather the most important information about it from reading just the abstract.
Need an example? Every research paper starts with an abstract.
\paragraph{Bibliography}
The bibliography (or references) lists all the references of your paper.
There are different reference styles you can use, but you should probably stick to what your template dictates.
If you use a non-author-year format, e.g., as in the numbered IEEE or ACM style, avoid sentences like ``[42] do XYZ.''.
Instead, use ``Doe et al. [42] do XYZ.'' as ``[42]'' is not a subject or object in your sentence but rather added meta-information.
The sentence needs to make sense without such a citation reference (exception: author-year format).
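The difference is easy to see in the source text.
The following sketch assumes \texttt{biblatex} with a numeric citation style, and \texttt{doe2020} is a hypothetical citation key that you would replace with one from your own bibliography file.
\begin{verbatim}
% Avoid: the citation label acts as the subject of the sentence.
\cite{doe2020} do XYZ.

% Prefer: the sentence still reads correctly without the label.
Doe et al.~\cite{doe2020} do XYZ.

% With biblatex, \textcite prints the author names followed by
% the label, so you do not have to type the names yourself.
\textcite{doe2020} do XYZ.
\end{verbatim}
How exactly \texttt{\textbackslash textcite} abbreviates a list of authors depends on your \texttt{biblatex} options, so check the output against what your template expects.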
\paragraph{Table of Contents}
Unlike research papers, theses usually also provide a table of contents that makes it possible to quickly jump to the section you want.
\paragraph{List of Figures, Tables, Abbreviations}
Some templates include a list of figures, tables, or abbreviations that are generated automatically.
While it is great to have one more page in your PDF, use those lists only when necessary, i.e., when you manage lots of tables or figures.
\paragraph{Acknowledgments}
If you want, you can include an acknowledgments section.
In a research paper, this is usually used to indicate conflicts of interest, funding, or people who have helped you.
If you can only think of your advisor to put here, leave this out and thank them in person instead.
As your thesis will not be published as-is, there is no reason to include it anyway.
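\paragraph{Putting It Together}
To summarize the structure described in this section, the following is a minimal sketch of a manuscript skeleton.
It is not our thesis template~\cite{thesis-template}, which takes precedence if you use it; the document class, the file name \texttt{references.bib}, the author name, and the title of the approach section are assumptions chosen only for illustration.
\begin{verbatim}
\documentclass[a4paper]{article}
\usepackage{biblatex}
\addbibresource{references.bib} % hypothetical file name

\title{A Performance Analysis of Linux Kernel Compilation}
\author{Jane Doe} % placeholder

\begin{document}
\maketitle

\begin{abstract}
  % short summary of the whole thesis
  % (plus a German "Zusammenfassung")
\end{abstract}

\tableofcontents

\section{Introduction} % context, research question, outline
\section{Background}   % terminology and concepts
\section{A Descriptive Name for Your Approach}
\section{Evaluation}   % study design first, then results
\section{Discussion}   % limitations and edge cases
\section{Related Work} % alternative approaches
\section{Conclusion}   % brief overview of what you have done

\printbibliography
\end{document}
\end{verbatim}
Writing the section headings down first and filling them in from top to bottom matches the start-to-finish way of working described at the beginning of this section.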
\subsection{Final Checks}
So, you are ready to submit your thesis!
Before you send it off to the examination office, please be sure to check everything on this non-exhaustive list of things that people commonly forget:
\begin{itemize}
\item Your name, university e-mail address, and matriculation number are on the title page.
\item The title of your thesis, the name of the first examiner, and the name of the second examiner are on the title page and \emph{identical} to that on the sign-up document you have received from the examination office at the start of your thesis.\footnote{You are unsure whether you understand everything on this document correctly because you have insufficient German language skills? Ask a friend or your advisor.}
\item The version of the thesis you are planning to submit is the ultimate, final version and not some older version.
\item Almost all of our theses are submitted digitally; please ask your (first) supervisor if you want to submit a printed version.
\item You have included a German language and English language abstract for your thesis.
\item You have included the \emph{Eidesstattliche Versicherung} including the current date, your name, and your signature.
\end{itemize}
\section{Research/Reading Advice}
\label{sec:research}
Having the right approach to research is one of the most important parts of writing a good thesis.
The reader should get the feeling that you know what you are talking about, have a full understanding of your research area, and can critically review research papers and articles before you reach a conclusion.
We cannot teach you how to conduct your research here (this is a skill you should have honed in your studies), but we want to give you some advice and guidelines.
First, we want to emphasize how the distinction between engineering and research problems affects the way you handle your research.
In an engineering documentation or other project report, you will feel inclined to \emph{defend} what you have done, e.g., you might try to find a study that shows the downsides of a decentralized approach to defend that you have come up with a centralized approach.
That makes sense when you want to sell something, but an important pillar of science is \emph{transparency}.
To clearly show what your idea can and cannot do, give a holistic overview of state-of-the-art research, including competing ideas.
In most cases, it is best to finish your research before you begin developing your idea.
Of course, it might be necessary to look into certain aspects more deeply once you have some experience developing your idea; then, however, do not be afraid to rethink aspects of your work when you learn more.
An important aspect of doing research is finding and evaluating publications -- not all papers are created equal!
Before you read a text, try to answer a few questions:
\paragraph{Is this a peer-reviewed text?}
Officially published texts, such as conference papers and journal articles, will receive a ``peer-review'', where other researchers in the field give feedback on the text and decide whether it meets certain scientific standards.
To find out whether a text has been peer-reviewed, see if it is published in a proper journal or conference proceedings.
Preprints of papers and articles are sometimes uploaded to servers such as arXiv~\cite{arxiv} before they go through peer-review to have them available earlier -- when you encounter a preprint, check who uploaded it and whether a published version exists.
If you find a published version in the \emph{ACM} or \emph{IEEE} library but cannot access it, try to find the ``institutional login'' and use your TUB account to download the paper.
Some texts, such as industry white papers, blog posts, websites, and software documentation, do not go through peer-review.
Use these texts only as additional references when absolutely required and try to find peer-reviewed alternatives.
\paragraph{Is the venue reputable?}
The peer-review process is usually part of publication in a venue, i.e., a conference, magazine, or journal.
Not all venues are equally reputable:
Some conferences accept only a small fraction of submitted papers and have high standards, while other journals might publish anything as long as the authors pay enough.
If you are not familiar with a venue, look at the institutions behind it:
If it is held or sponsored by \emph{ACM}, \emph{IEEE}, or other well-known organizations, there is a good chance that certain standards are met.
For the sake of completeness, \emph{Wiley}, \emph{Springer}, and \emph{Elsevier} (and a few smaller publishers) are trustworthy as well.
On the other side of the spectrum, \emph{predatory publishers} publish everything they can get their hands on if the authors pay their high fees.
Note that some journals and conferences do not work with a publisher directly, making it a bit harder to determine their quality levels.
In this case, look at \emph{who} publishes with them (more on determining trustworthiness of authors below).
You can also look at conference rankings, but these can be a bit hit-and-miss.
Conversely, anything on Beall's List~\cite{beallslist} can safely be disregarded completely.
\paragraph{Are the authors trustworthy?}
Finally, see if you find the authors trustworthy.
Are they affiliated with an institution you know and trust?
Find out if the authors have previously researched in this area.
Usually, the last listed author is the most senior researcher, e.g., the professor -- find out whether they are notable or not.
You can also check out the references in the manuscript:
Do the authors only cite their own work or do they themselves follow good scientific practices?
Finally, you might think that a paper by an industry group such as Microsoft Research will be biased, e.g., hiding information to make the company look better.
However, you can trust the peer-review process and the authors' integrity (especially at such a reputable institution).
If you answered ``Yes'' to all three questions, does that mean you can just take everything in the text at face value?
Well, no.
There can always be mistakes in a paper, or things that the authors cannot know when they write a text (e.g., how a technology will develop over the years).
Conversely, even if you have answered ``No'' to one of the questions above, you might still have a good paper in front of you (although this is very unlikely).
Do keep an open mind when you read any text you encounter.
\section{Writing/Presentation Advice}
\label{sec:writing}
Finally, here is a non-exhaustive list of writing tips for your thesis.
We highly recommend that you adopt the writing style of the papers you read (with the notable exception of~\cite{lamport1998part}).
If you plan to write your master's thesis or want to improve your English writing skills, look up and read the book ``The Elements of Style''~\cite{strunkelements}.
\paragraph{Use a spell-checker}
There is no argument against this.
Spell-checkers exist, they are free to use, and they are available on any platform.
Trust them to do their job correctly; they know the English language better than you do.
Depending on the platform, we recommend \emph{LTeX}~\cite{ltex} and \emph{TeXtidote}~\cite{textidote}, but you may choose a different one based on your preferences.
\paragraph{\textit{Use a spell-checker}}
We are not sure if you have heard us the first time.
It is literally free!
It takes a minute to set up!
There is no justification for exposing your advisor to poor language; it can only hurt you!
Some advisors may consider this an insult and a waste of their time.
\paragraph{Be direct in your wording}
Avoid the passive voice and be direct instead.
Use present tense (or past tense if you have to, but be consistent\footnote{Please, please be consistent.}) everywhere and describe your actions clearly.
\paragraph{Be professional (i.e., don't write like we do here)}
Use professional language, this is an academic environment, after all.
Don't use short forms (``do not'' instead of ``don't'') or other colloquial language, such as ``like''.
Additionally, don't address the reader directly, as we do in this guide.
There is usually no need for this.
\paragraph{Use the Oxford comma}
Use the Oxford comma, i.e., a comma before ``and'' where applicable~\cite{oxfordcomma}.
This makes your text much more readable as it removes ambiguities.
\paragraph{Use ``We'' instead of ``I''}
This is hard to wrap your head around for the first few chapters you write, but it is common practice to use the first-person plural instead of singular when referring to the author(s), even when it is just a single person (like you, writing your thesis).
There is no reason for a reader to care whether one or more people wrote a text, so why confront them with that information in every paragraph?
Besides, no research is ever the result of the work of only a single person -- even when they are not listed as an author, your advisor or peers helped you in your thesis.
\paragraph{Avoid weasel words}
Weasel words such as ``much'', ``very'', or ``significantly'' do not help in your texts.
Instead of ``We were able to improve performance significantly.'', write ``We improved read latency by 87\%.''.
This would make for a terrible novel, but is much better for a thesis.
\paragraph{Use connectives to support your line of reasoning}
Do not make the reader guess why sentence B follows sentence A.
Instead\footnote{``Instead'' is a good connective!}, use connective words such as ``additionally'', ``furthermore'', ``in contrast'', or ``consequently''.
You can find a list of connectives and a more detailed explanation on how to use them in~\cite{connectives}.
\paragraph{Paraphrase, do not copy text}
When you cite a text, do not copy parts of the text, such as a definition.
Instead, paraphrase the key message of that text and integrate it into your own text.
This shows that you understand the text and are not mindlessly copy-pasting (you still need to cite it though!).
\paragraph{Do not (attempt to) plagiarize}
\label{subsec:plags}
Do not present someone else's work or ideas as your own.
Acknowledge the authors of the original work when you paraphrase an idea.
Similarly, do not copy other people's text.
Referencing other work shows that you have taken the time to read and understand related work, which increases the trustworthiness of your own work.
While there are some best practices for citing (you will quickly learn them when reading papers), there are no strict rules on where to put your citation.
Think about readers who are not you: To them, it should be crystal clear (!), without any doubt, glaringly obvious (i) which text came from someone else (a direct quote) and (ii) which thoughts came from someone else (an indirect quote).
If in doubt, ask your advisor.
Note that any attempt at plagiarism will be reported as such.
\paragraph{Please use a spell-checker}
Please.
Pretty please.
\paragraph{Presenting Concepts Visually} \label{sec:diagrams}
Any piece of technical or academic writing can benefit greatly from putting concepts into pictures.
As they say, ``one picture is worth a thousand words''~\cite{thousandwords}.
Most concepts can be visualized as diagrams of lines and boxes, making it easier for your reader to understand the relationships and roles of individual components.
Keep the number of diagrams to a sensible level and use references in your text and figure captions to explain what can be seen in those diagrams.
Do not put too many concepts in those diagrams, however: try to limit yourself to overviews and create a new figure for more detailed explanations that may be required.
There are a number of tools out there that you can use to visualize your concepts.
In our group, we mostly use MS PowerPoint for the figures in our papers.
It is easy to use, you can export figures as vector graphics, e.g., as a PDF, so they are easy on the eyes and great to view digitally, and you can even reuse them in your presentations.
Be sure to keep a consistent visual style and employ colors only when necessary to get a specific concept across.
You may also want to use diagrams from other work, such as from papers or presentations.
Instead of screenshotting those diagrams, we highly suggest recreating them with your personal visual style and in higher quality.
That helps keep the figures in your thesis consistent and of high quality.
Nevertheless, you must absolutely cite those figures!
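As a minimal sketch of how a diagram can be included, captioned, and referenced from the text, consider the following snippet.
It assumes that the \texttt{graphicx} package is loaded; the file name \texttt{figures/architecture.pdf} and the label \texttt{fig:architecture} are hypothetical placeholders, not part of this template.
\begin{verbatim}
% In the running text, refer to the figure by its label:
\Cref{fig:architecture} gives an overview of the system architecture.

\begin{figure}
    \centering
    % Vector graphic exported as PDF (hypothetical file name)
    \includegraphics[width=\linewidth]{figures/architecture.pdf}
    \caption{Overview of the system architecture. If the figure is
        adapted from other work, cite that work here.}
    \label{fig:architecture}
\end{figure}
\end{verbatim}
Placing the label after the caption ensures that the reference resolves to the correct figure number.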
\paragraph{Presenting Data (Charts)}
Besides presenting concepts, such as a system's architecture, you will probably also need to present some data, such as results from a simulation.
Here, employing visual tools is of even greater importance.
You should use charts for the key results of your work.
Most charts show the relationship between a number of variables and a value, such as a measurement or observation.
The relationship you want to illustrate determines the kind of chart you should use.
Variables are the things that you yourself set, and they can be of two different types:
\begin{itemize}
\item A \emph{numerical} variable is a number. It has an order and covers a range of possible values. This could be the experiment time or an increasing workload.
\item A \emph{categorical} variable is one that can have values of a specific category. Think of varying the cloud platform you test: the different providers are values of the same variable. There is no inherent order in those values.
\end{itemize}
In almost all cases, your output variable will also be numerical.
If you find yourself in the situation that your categorical variable has only one value, it is not a variable and your plot will not show anything meaningful.
You will typically want to start with a two-dimensional chart.
As it has two dimensions, you can only show one variable (the first dimension) and one output value (the second dimension).
The input should be on the horizontal X-axis and the output should be on the vertical Y-axis.
If your goal is to show the relationship between a numerical input and numerical output, you should use a \emph{scatter plot} where you draw points for your individual measurements.
If you want to show the relationship between a categorical variable and a numerical output, use a \emph{bar chart}, \emph{box plot}, or \emph{violin plot}, depending on what you want to show.
Choose your first input dimension wisely, as it is the most important.
Instinctively, many people will use time as this dimension, but this does not always make sense, especially when the effects you observe do not change over time.
Instead, please also consider plotting the \emph{empirical distribution function} of your data as it can be more meaningful to show many data points.
You can then also add further input dimensions with colors and line or point styles -- use categorical colors for categorical values and color gradients for numerical ones.
Of course, the key elements from \cref{sec:diagrams} about diagrams apply here as well:
Keep a consistent visual style, create high-quality figures, use colors only when needed to get a point across, and explain everything you see in the charts in your main text and/or figure captions.
We mainly use the Python package \emph{seaborn}~\cite{seaborn} to create graphs.
In combination with the data analysis functionality of \emph{pandas}~\cite{pandas}, you can create some great charts and dig deep into your data.
You can even do this interactively in a \emph{Jupyter} notebook~\cite{jupyter}!
As a computer science (or related subject) student, you should have all the knowledge you need to learn to use those tools.
\section{Giving a Research Talk}
\label{sec:talk}
In addition to your written thesis, you will often also have the opportunity to present your work in a research talk, e.g., to the research group your advisor is part of.
Such a presentation can help you disseminate your research to a wider audience than would normally read your thesis, i.e., your advisor and examiner.
For every presentation you give, you should have a specific goal in mind and be able to answer the question ``What am I trying to achieve with this talk?'' with something that goes beyond ``My advisor told me to do it.''
Fortunately, in the case of presenting your thesis work, the answer is straightforward:
Your goal should be to collect as much feedback from a wider audience as possible.
Usually, the thesis presentation happens before your final submission, giving you enough time to act on that feedback.
In your presentation, give a best-effort overview of what you have already done and what you are still planning to do.
Do not gloss over important details that could provoke interesting comments from your audience!
You can expect that most attendees have a high level of knowledge about your specific topic, and you should thus spend as much of your presentation as possible on your original research rather than repetitive background information.
We also recommend sticking to the standard university template for presentations, as it is less distracting than a fancy custom one.
An important part of creating your slide set is having page numbers, so that audience members can make comments about specific slides.
If you cannot fit everything you did into the presentation, prepare backup slides of other figures and charts you have, which you may use in a discussion session to go into more detail on specific issues and questions.
Although you should not go into related work in your presentation, as it is mostly unimportant to your audience, make sure to put citations on claims, figures, and images that are not your own.
In general, a great presentation can have great results for your thesis: interesting feedback from the audience and a boost to your grade.
Do not waste this opportunity; instead, show up with enthusiasm and demonstrate that you have made an effort to present your interesting research work!
Anything else could be viewed as a waste of your audience's time, and the people who determine your grade are part of that audience!
You can find more information on giving research talks in a presentation by Jones~\cite{jones-talk}, but keep in mind that the goals of a talk at a technical conference and those in a university setting might be different.
\section{Working With Your Advisor}
\label{sec:advisor}
While your thesis will be examined by professors, you will mainly interface with your thesis advisor.
Theses are usually supervised by PhD students or postdocs, who conduct their own research and have possibly advised many students before you.
Nevertheless, it should be obvious that you alone are responsible for your thesis.
It is your task to ensure that the thesis is completed to high standards, that it meets all the formal requirements, that you hand it in on time, and that the content makes sense.
Of course, your advisor will answer all your questions to the best of their ability, but especially for formal requirements, they may have outdated information and may not know your specific circumstances.
It is important that you treat your advisor as a resource for advice, not as your thesis manager.
You yourself are responsible for making a work plan and doing your research.
If you strictly ask your advisor what you should do every week and then just do that, expect a ``minimum'' grade, at best.
Instead, ask for advice on your plans or discuss results with your advisor.
Make sure to listen to what they have to say and use this advice to improve your work.
This does not necessarily mean that you should implement any ideas your advisor has to a tee -- remember that your thesis is your responsibility.
When asking for advice, it is a good idea to be as specific as possible rather than simply asking for any feedback.
If you need high-level feedback on your approach, make this expectation explicit.
If you want to know how to present some experiment data, come up with some options and ask your advisor to discuss them with you.
The more specific your questions are, the better advice you can expect.
Finally, you should be respectful of your advisor's time (as in every human interaction).
Just like you are busy finishing your thesis (and working on the side or caring for others), your advisor has their own work to complete and deadlines to hit.
Be on time for your meetings and come with an agenda of questions you have or things you want to present.
Expect that your thesis may not be as high on your advisor's mind as it is on yours.
When asynchronously asking for feedback from your advisor, e.g., sending them a draft to read through before your next meeting, do not expect them to drop everything and start on this immediately, but give them sufficient time.
Despite this, do not think that your advisor has no time for you and that you are on your own -- quite the opposite.
Instead, we think that if you have realistic expectations and communicate with your advisor efficiently, they can be an excellent resource for your thesis.
\section{Further Reading}
\label{sec:resources}
We cannot cover everything in this article, so we want to draw your attention to a few further resources.
In a blog post~\cite{leitner}, Philipp Leitner gives some valuable tips for writing a software engineering thesis.
Simon Peyton Jones of Microsoft Research shares some valuable insights into writing technical papers~\cite{jones-paper} and giving research talks~\cite{jones-talk}.
You can find additional writing tips in~\cite{patterson-writing,ernst-writing,schulzrinne-writing}.
\section{Conclusion}
\label{sec:conclusion}
Writing a computer science thesis is hard, especially when you do it for the first time.
Having some research experience from seminars can help, but there are a lot of firsts for each student.
We hope that this text gives you some structure for your upcoming thesis and that the tips and hints help you along the way.
\printbibliography
\end{document}