# Homework assignment – Basic Statistics

## Aim: Why this homework?

The aims of this homework are to:

- make you practice statistics in the context of your own research,
- provide you with feedback,
- help you learn how best to use statistics through examples from others.

## Overview of the tasks: What to do and when?

### Task 1: prepare a talk and a one-page summary

**When?**: between course day 5 (8/11) and one day after course day 9 (23/11)

**What?** Prepare a short presentation (8 mins) which you will present
on the last course day (29/11) as well as a one-page description of
your talk (see details below).

### Task 2: read two one-page summaries and prepare yourself to ask questions

**When?**: between two days after day 9 (24/11) and day 10 (29/11)

**What?** Read two one-page summaries from other students and prepare
yourself to ask at least one question about each of the two
corresponding talks. You will ask your questions on day 10 (29/11),
just after the end of the talks, to kick off a plenary discussion with
the other students and the teacher. The question must relate to
statistics (see details below).

### Task 3: give a talk and ask questions

**When?**: day 10 (29/11)

**What?** As a presenter, give your talk and engage in a discussion with
the audience. As an attendee, immediately after a talk for which you
have read a one-page summary, ask a question. You should listen
carefully during the two talks for which you have read a one-page
summary. Then do your best to ask an interesting question. After any
other talk, feel welcome to contribute to the plenary discussions
that follow the first questions and answers (see more details below).

## More Details for tasks 1 and 2

### Details for preparing the presentation

**Important:**

- You should use **data** from your own PhD project or data related to your PhD project obtained from your colleagues or supervisor.

- As an alternative to presenting a data analysis, you can present a **thought-through sample size calculation**, power calculation or least detectable difference calculation. The calculation must relate to an experiment or a clinical trial that you are planning as part of your PhD project. This should be done in agreement with the course organizer. Please let him know ASAP, **at the latest on course day 7** (15/11), if this is your preferred choice.
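
To give an idea of what such a calculation can look like, here is a minimal sketch for comparing the means of two equally sized groups with the usual normal-approximation formula. The function name `sample_size_per_group` and the planning values (a difference of 5 units, a standard deviation of 10) are invented for illustration and are not part of the course material. Software based on the t-distribution will typically give a slightly larger number.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means.

    Normal approximation: n = 2 * (sd * (z_{1-alpha/2} + z_{power}) / delta)^2.
    Assumes equal group sizes and a common standard deviation (assumption).
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # quantile for the two-sided level
    z_beta = z.inv_cdf(power)           # quantile for the target power
    n = 2 * (sd * (z_alpha + z_beta) / delta) ** 2
    return math.ceil(n)  # round up to a whole number of participants


# Hypothetical planning values: detect a difference of 5 units when sd = 10.
n_per_group = sample_size_per_group(delta=5, sd=10)
print(n_per_group)
```

A thought-through calculation would also justify where the assumed difference and standard deviation come from, for example a pilot study or published results.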

**Main instructions:**

- Prepare at most 6 slides which you can **realistically** present in about 8 minutes.
- Hand in your presentation by email to `pabl at sund dot ku dot dk` by **November 23, 2023**.
- The presentation has to include elements of each of the following sections (except if you present a calculation for planning an experiment):
  - *Research question*: including a definition of the background population.
  - *Material/Data*
  - *Statistical methods*: can be short but should cover all the statistical methods that are used to produce the *Results*.
  - *Results*:
    - At least one descriptive statistic but can be more, e.g., a table with descriptive statistics for several variables (by groups).
    - At least one graph which either shows the raw data or a result of the statistical analysis, as appropriate.
    - At least one conclusion sentence based on statistical inference of the data which includes a confidence interval and p-value.
- Clearly define the study population, outcome and exposures/treatments/groups you compare. For instance, please explain how and when an outcome is measured or a treatment is administered.
- Prepare your talk such that **the audience should be able to follow your talk, enjoy it and learn from it.** **Keep in mind that this course is about statistics.** Do not spend more time than necessary on the biological/medical background which motivates the research question and statistical analyses. Instead, keep more time to discuss why the statistical analyses you chose seem appropriate for your research question and data, that is, for your specific context.
- Figures and tables are only acceptable if they come with a proper caption (legend).
- Hand in your presentation in pdf format, if possible. A PowerPoint or html format is also accepted if necessary, but should be avoided, if possible. The file has to be named `Talk2023-your-full-name.pdf` (or `.ppt` or `.html`).
- If you cannot attend the last course day, send an email to `pabl at sund dot ku dot dk` ASAP. This is important since you cannot pass the course in that case, unless we find a good reason and a suitable alternative to make an exception.

**Further instructions:**

- The statistical part does not have to be complex. Keep it simple, so that you are sure that you understand the methods that you apply. Choose a research question which requires a complex analysis only if you feel that you can explain everything within the time limits.
- You do not have to use all the variables of your data set and you are allowed to subset the data to simplify the problem. To simplify the analysis you may also categorize some continuous variables.
- The result section must include at least one p-value and one confidence interval. However, note that to achieve this it is sufficient to perform a single comparison of an outcome variable between two groups.
- All decimal numbers should be in English notation and suitably rounded.
- The methods section must include the units of the variables and the name of the statistical method which is used to calculate the p-value.
- For all results that are shown in your presentation you should be able to explain what they mean.
- Presenting screenshots is allowed but should be avoided, if possible.
- Do not put more material on your slides than necessary.
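
As a rough illustration of the minimum described above (one comparison of an outcome between two groups, reported with a confidence interval and a p-value), the sketch below uses only the Python standard library. It relies on a large-sample normal approximation, which is an assumption made here to keep the example self-contained. With small samples you would normally use a t-test from your statistical software instead. The function name `compare_two_groups` and the data are invented for illustration.

```python
from statistics import NormalDist, mean, stdev


def compare_two_groups(x, y, level=0.95):
    """Difference in means (x minus y) with a CI and a two-sided p-value.

    Large-sample normal approximation; the two groups may have
    different variances. Not a substitute for a t-test in small samples.
    """
    diff = mean(x) - mean(y)
    # Standard error of the difference between two independent means.
    se = (stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y)) ** 0.5
    z = NormalDist()
    crit = z.inv_cdf(1 - (1 - level) / 2)
    p_value = 2 * (1 - z.cdf(abs(diff) / se))
    return diff, (diff - crit * se, diff + crit * se), p_value


# Hypothetical outcome measurements (arbitrary units) in two groups.
treated = [5.1, 6.0, 5.8, 6.4, 5.5, 6.1, 5.9, 6.3]
control = [4.8, 5.2, 5.0, 5.6, 4.9, 5.3, 5.1, 5.4]

diff, (lower, upper), p = compare_two_groups(treated, control)
# Report suitably rounded numbers, in English decimal notation.
print(f"Difference: {diff:.2f} (95% CI {lower:.2f} to {upper:.2f}), p = {p:.3f}")
```

On a slide, the same numbers would come with the units of the outcome and the name of the method used to compute the p-value.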

### Details for the one-page summary

The length should not exceed one page. The document should provide a good overview of the presentation, especially (concise) information about:

- *Research question*
- *Data*
- *Statistical methods*
- *Results*

Keep in mind that the better you describe your work, the more likely you are to receive good feedback. Read the instructions on how to read summaries from others before writing your own one-page summary (see below). They should help you to write your own.

The document should also include:

- The title of your talk
- Your name
- Your affiliation
- Four keywords: two about your research topic and two about the statistical methods you used

Please send the document in pdf format. The file has to be named `Summary2023-your-full-name.pdf`.

### Details for Task 2: how to read summaries from others and prepare yourself to ask questions

You are not expected to be familiar with the research described in the two one-page summaries that you will read. You are not expected to spend a lot of time trying to become familiar with it. However, please be curious and use a few minutes to read and learn, e.g. from Wikipedia, if the one-page summaries contain keywords that you do not understand.

You do not necessarily need to have a question ready to ask before you
listen to the talks on day 10 (29/11). However, reading the one-page
summaries should give you the opportunity to listen to the talks well
prepared. To best prepare yourself for listening to the talks, you are
**strongly encouraged to re-read** the course material **about the
statistical methods mentioned in the one-page summaries**. It is
important that you remember the main ideas of "When, Why and How?" to
use the specific statistical methods quoted in the talks. This will
help you to ask interesting questions about the use of the statistical
methods, in the specific context presented in the talks.

This course is about statistics and your **questions must relate to
statistics**. They can be about the following (but not necessarily):

- the **interpretation** of the statistical results: Do they answer the research question(s) well? Why? Are the main conclusions well aligned with the statistical results? Not too strong, not too weak? Could the results be sensitive to important assumptions? How does the background knowledge help with the interpretation of the statistical results?

- the choice of the **statistical method**: Is there a good match between the research question, the available data and the choice of the statistical method? Are you aware of any alternative methods that could have been considered, their pros and cons? Do the assumptions of the statistical methods seem sensible here? Have you considered any data transformation, e.g. log-transformation of the outcome? Why prefer an unadjusted to an adjusted analysis (or the other way around) in this context? Does a multiple testing adjustment seem useful here, why?

- **prespecified vs post-hoc analyses**: Were the analyses prespecified or post-hoc here, why? Are the conclusions tuned accordingly?

- the **presentation** of the statistical results: Could the results (tables, plots, conclusion sentences) be presented differently, to facilitate their interpretation and communication?

- the **data** being presented: Could the data have possibly been collected in a different way (better/worse)? Are the data representative of the population of interest, for the research question? Is this of major / minor importance here, for the interpretation of the results? Do the data seem particularly challenging to "best" analyze? Could some of the pragmatic choices, to make the analysis "simple", be made differently?
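
If you want to refresh what a multiple testing adjustment does before asking about it, the sketch below computes Bonferroni and Holm adjusted p-values by hand. It is only an illustration: the function names and the three p-values are invented, and in practice you would use the adjustment routine of your statistical software.

```python
def bonferroni_adjust(p_values):
    """Bonferroni adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[i])
        # Adjusted p-values must not decrease along the ordering.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical p-values from three comparisons.
raw = [0.01, 0.04, 0.03]
print([round(p, 3) for p in bonferroni_adjust(raw)])
print([round(p, 3) for p in holm_adjust(raw)])
```

Holm adjusted p-values are never larger than the Bonferroni ones, which is one reason a presenter might prefer Holm when controlling the family-wise error rate.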

**Further information:**

- Please be respectful and constructive when asking your questions.
- Do your best to ask a question that you find interesting yourself and that could likely be interesting to others too.
- Try to formulate your question clearly. The question should not only be clear to you and the presenter, but also to the rest of the audience.
- Don't be afraid of asking a "stupid" question. The experience of the course organizer is that your questions are rarely "stupid". In any case, that would not be a big problem. The teacher will contribute to the discussion and do his best to make the discussion interesting, pleasant and instructive, whatever happens.