<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>data visualization on Neurospection</title><link>https://neurospection.netlify.com/tags/data-visualization/</link><description>Recent content in data visualization on Neurospection</description><generator>Source Themes Academic (https://sourcethemes.com/academic/)</generator><language>en-us</language><copyright>`© `{year} Stefania Ashby</copyright><lastBuildDate>Mon, 09 Dec 2019 00:00:00 +0000</lastBuildDate><atom:link href="https://neurospection.netlify.com/tags/data-visualization/index.xml" rel="self" type="application/rss+xml"/><item><title>Violin Plots</title><link>https://neurospection.netlify.com/post/violin-plots/</link><pubDate>Mon, 09 Dec 2019 00:00:00 +0000</pubDate><guid>https://neurospection.netlify.com/post/violin-plots/</guid><description>
&lt;div id=&#34;intro&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Intro&lt;/h1&gt;
&lt;p&gt;Violin plots allow us to look at the distribution of our data.
But I know what you’re thinking: “Can’t I just use a density plot to do the same thing?”
While it’s true that a density plot shows the same information, violin plots work better when you need to plot multiple groups or conditions in the same chart.&lt;/p&gt;
&lt;p&gt;Let me show you why:&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;density-plot&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Density Plot&lt;/h1&gt;
&lt;p&gt;The example data I will use here comes from a manuscript that I am currently preparing for publication. I’m plotting two conditions: (1) categorization accuracy for old items and (2) categorization accuracy for new items.&lt;/p&gt;
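&lt;p&gt;The manuscript data aren’t included in this post, so if you want to run the code yourself, here is a minimal sketch that simulates a stand-in data set. The column names &lt;em&gt;accuracy&lt;/em&gt; and &lt;em&gt;condition&lt;/em&gt; come from the plotting code; everything else (the number of subjects, the beta-distributed accuracies, and the text in &lt;em&gt;cat_labels&lt;/em&gt;) is an assumption made for illustration, so your plots won’t match the figures shown here.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;## Hypothetical stand-in data: NOT the manuscript data behind the figures
set.seed(1)   # make the simulated values reproducible
n_subj &amp;lt;- 30  # assumed number of subjects
cat_plot_data &amp;lt;- data.frame(
  subject   = factor(rep(seq_len(n_subj), times = 2)),
  condition = factor(rep(c(&amp;quot;old&amp;quot;, &amp;quot;new&amp;quot;), each = n_subj),
                     levels = c(&amp;quot;old&amp;quot;, &amp;quot;new&amp;quot;)),
  # accuracy as a proportion between 0 and 1
  accuracy  = c(rbeta(n_subj, 8, 2), rbeta(n_subj, 5, 3))
)
## Display labels, in the same order as the factor levels (assumed wording)
cat_labels &amp;lt;- c(&amp;quot;Old Items&amp;quot;, &amp;quot;New Items&amp;quot;)
str(cat_plot_data)&lt;/code&gt;&lt;/pre&gt;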
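&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)     # ggplot(), geom_density(), geom_violin(), ...
library(wesanderson) # wes_palette()
## cat_plot_data and cat_labels hold the manuscript data and its display labels (not shown here)
pal = wes_palette(&amp;quot;Darjeeling1&amp;quot;, 2, type = &amp;quot;discrete&amp;quot;) # Wes Anderson palettes are fun! Check them out!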
## Build density plot
cat_density1 &amp;lt;- ggplot(cat_plot_data, aes(x = accuracy, fill = condition)) +
geom_density() +
scale_fill_manual(labels = cat_labels, values = pal) +
labs(title = &amp;quot;Categorization&amp;quot;,
x = &amp;quot;Accuracy&amp;quot;,
y = &amp;quot;Density&amp;quot;) +
theme(legend.position = &amp;quot;bottom&amp;quot;,
legend.title = element_blank(),
legend.justification = &amp;quot;center&amp;quot;,
plot.title = element_text(hjust = .5))
cat_density1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;../../post/2019-12-09-violin-plots_files/figure-html/density%20plot-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The density plot gives us a nice look at the distribution of the two groups, but the distributions overlap. That may not be a big deal when we are comparing only two conditions or groups, but imagine how much harder the chart would be to read with three or more.&lt;/p&gt;
&lt;p&gt;One thing we could do is use the &lt;em&gt;facet_wrap&lt;/em&gt; function to split our distributions into separate panels, stacked one above the other (&lt;em&gt;ncol = 1&lt;/em&gt;).&lt;/p&gt;
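&lt;p&gt;With only two groups, a quick partial fix for the overlap is to make the fills semi-transparent with the &lt;em&gt;alpha&lt;/em&gt; argument of &lt;em&gt;geom_density&lt;/em&gt;. The sketch below uses simulated values in a hypothetical data frame called &lt;em&gt;toy&lt;/em&gt;, not my data, and the 0.5 transparency is just a starting point. It still gets crowded quickly as groups are added.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)
## Simulated stand-in data (hypothetical, for illustration only)
set.seed(1)
toy &amp;lt;- data.frame(
  condition = rep(c(&amp;quot;old&amp;quot;, &amp;quot;new&amp;quot;), each = 30),
  accuracy  = c(rbeta(30, 8, 2), rbeta(30, 5, 3))
)
## alpha = 0.5 lets the curve behind show through the one in front
p_alpha &amp;lt;- ggplot(toy, aes(x = accuracy, fill = condition)) +
  geom_density(alpha = 0.5)
p_alpha&lt;/code&gt;&lt;/pre&gt;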
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;cat_density2 &amp;lt;- ggplot(cat_plot_data, aes(x = accuracy, fill = condition)) +
geom_density() +
scale_fill_manual(labels = cat_labels, values = pal) +
labs(title = &amp;quot;Categorization&amp;quot;,
x = &amp;quot;Accuracy&amp;quot;,
y = &amp;quot;Density&amp;quot;) +
theme(legend.position = &amp;quot;bottom&amp;quot;,
legend.title = element_blank(),
legend.justification = &amp;quot;center&amp;quot;,
plot.title = element_text(hjust = .5)) +
facet_wrap(~condition, ncol = 1) +
theme(strip.background = element_blank(), #Remove the condition labels since we have a legend
strip.text.x = element_blank())
cat_density2&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;../../post/2019-12-09-violin-plots_files/figure-html/density%20plot%202-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;This looks pretty nice! But violin plots show the same information with all groups in a single chart. No duplicate y-axis!&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;violin-plots&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;Violin Plots&lt;/h1&gt;
&lt;p&gt;We can plot the same data on a single graph like so:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;cat_v_basic &amp;lt;- ggplot(cat_plot_data, aes(x = condition, y = accuracy, fill=condition)) +
geom_violin(trim = FALSE) +
scale_fill_manual(values = pal) +
ylab(&amp;quot;Categorization Accuracy (% Correct)&amp;quot;) +
theme(legend.position = &amp;quot;none&amp;quot;,
axis.title.x = element_blank(),
axis.title.y = element_text(size = 15),
text = element_text(family = &amp;quot;Arial&amp;quot;,
size = 25)) +
scale_y_continuous(breaks = c(0,.25, .5, .75, 1)) +
scale_x_discrete(labels= cat_labels)
cat_v_basic&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;../../post/2019-12-09-violin-plots_files/figure-html/basic%20violin%20plot-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
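&lt;p&gt;If it helps to see why a violin carries the same information as a density plot, here is a rough base R sketch that builds one by hand: estimate the density, then mirror it around a vertical axis. The values are simulated, and this only approximates what &lt;em&gt;geom_violin&lt;/em&gt; does internally. It also shows why an untrimmed violin can poke past the observed range (and past 0 or 1 for accuracy): by default, &lt;em&gt;density&lt;/em&gt; evaluates the curve a few bandwidths beyond the smallest and largest values, and as far as I can tell &lt;em&gt;trim = FALSE&lt;/em&gt; extends the tails in a similar way.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;## Simulated accuracies for one condition (hypothetical, for illustration only)
set.seed(1)
accuracy &amp;lt;- rbeta(30, 8, 2)
## Kernel density estimate; the default cut = 3 extends the grid past the data
d &amp;lt;- density(accuracy)
## Rescale so the widest point of the violin has a total width of 1
half_width &amp;lt;- d$y / max(d$y) / 2
## Draw the density mirrored around x = 0, with accuracy on the y-axis
plot(NA, xlim = c(-0.6, 0.6), ylim = range(d$x),
     xlab = &amp;quot;&amp;quot;, ylab = &amp;quot;Accuracy&amp;quot;, xaxt = &amp;quot;n&amp;quot;)
polygon(c(-half_width, rev(half_width)), c(d$x, rev(d$x)), col = &amp;quot;grey80&amp;quot;)
## Compare the observed range with the range covered by the curve
range(accuracy)
range(d$x)&lt;/code&gt;&lt;/pre&gt;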
&lt;div id=&#34;adding-dots-for-individual-differences&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Adding dots for individual differences&lt;/h2&gt;
&lt;p&gt;I can also superimpose individual dots for each subject to help visualize individual differences in the data.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;cat_l_basic &amp;lt;- ggplot(cat_plot_data, aes(x = condition, y = accuracy, fill=condition)) +
geom_violin(trim = FALSE) +
geom_dotplot(binaxis = &amp;#39;y&amp;#39;, stackdir = &amp;#39;center&amp;#39;, dotsize = .75, fill = &amp;quot;black&amp;quot;) + #added dots
scale_fill_manual(values = pal) +
ylab(&amp;quot;Categorization Accuracy (% Correct)&amp;quot;) +
theme(legend.position = &amp;quot;none&amp;quot;,
axis.title.x = element_blank(),
axis.title.y = element_text(size = 15),
text = element_text(family = &amp;quot;Arial&amp;quot;,
size = 25)) +
scale_y_continuous(breaks = c(0,.25, .5, .75, 1)) +
scale_x_discrete(labels= cat_labels)
cat_l_basic&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;../../post/2019-12-09-violin-plots_files/figure-html/less%20basic%20violin%20plot-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;adding-mean-and-reference-line&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Adding mean and reference line&lt;/h2&gt;
&lt;p&gt;Want to know the average accuracy? I can also add a marker to denote the mean for each group and a reference line to show where chance performance lies (33% for three categories). I’ll also space the dots further apart from one another so they’re no longer touching.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;pal = wes_palette(&amp;quot;Darjeeling1&amp;quot;, 2, type = &amp;quot;discrete&amp;quot;)
## Build plot
cat_v_fancy &amp;lt;- ggplot(cat_plot_data, aes(x = condition, y = accuracy, fill=condition)) +
geom_violin(trim = FALSE, scale = &amp;quot;count&amp;quot;) +
geom_dotplot(binaxis = &amp;#39;y&amp;#39;, stackdir = &amp;#39;center&amp;#39;, dotsize = .75,stackratio = 1.5, fill = &amp;quot;black&amp;quot;) +
stat_summary(fun = mean, geom = &amp;quot;point&amp;quot;, size = 3, shape = 23, fill = &amp;quot;Gold&amp;quot;) + #adding mean marker (fun replaces fun.y, deprecated since ggplot2 3.3.0)
scale_fill_manual(values = pal) +
labs(title = &amp;quot;Categorization&amp;quot;,
y = &amp;quot;Categorization Accuracy (% Correct)&amp;quot;) +
theme(legend.position = &amp;quot;none&amp;quot;,
plot.title = element_text(size = 20, hjust = .5),
axis.title.x = element_blank(),
axis.title.y = element_text(size = 15),
text = element_text(family = &amp;quot;Arial&amp;quot;,
size = 25)) +
scale_y_continuous(breaks = c(0,.25, .5, .75, 1)) +
scale_x_discrete(labels= cat_labels) +
geom_hline(yintercept = .333, linetype = &amp;quot;dashed&amp;quot;, color = &amp;quot;black&amp;quot;) #added reference line
cat_v_fancy&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;../../post/2019-12-09-violin-plots_files/figure-html/fancy%20violin%20plot-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
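&lt;p&gt;To double-check what the mean markers and the dashed line represent, you can compute the same numbers directly. This is a small sketch on simulated values (the data frame &lt;em&gt;toy&lt;/em&gt; and the column &lt;em&gt;above_chance&lt;/em&gt; are hypothetical names, and these are not my results): the group means are what &lt;em&gt;stat_summary&lt;/em&gt; draws, and chance for three categories is 1/3.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;## Simulated stand-in data (hypothetical, for illustration only)
set.seed(1)
toy &amp;lt;- data.frame(
  condition = rep(c(&amp;quot;old&amp;quot;, &amp;quot;new&amp;quot;), each = 30),
  accuracy  = c(rbeta(30, 8, 2), rbeta(30, 5, 3))
)
## Chance performance with three equally likely categories
chance &amp;lt;- 1 / 3
## One mean per condition: the value each mean marker sits on
group_means &amp;lt;- aggregate(accuracy ~ condition, data = toy, FUN = mean)
## Flag whether each group mean falls above the chance line
group_means$above_chance &amp;lt;- group_means$accuracy &amp;gt; chance
group_means&lt;/code&gt;&lt;/pre&gt;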
&lt;/div&gt;
&lt;/div&gt;</description></item></channel></rss>