A reader of the Peltier Tech Blog asked me about Swimmer Plots. The first chart below is taken from “Swimmer Plot: Tell a Graphical Story of Your Time to Response Data Using PROC SGPLOT (pdf)“, by Stacey Phillips, via Swimmer Plot by Sanjay Matange on the Graphically Speaking SAS blog.
The swimmer chart below is an attempt to show the responses of several patients to drug treatments. Each horizontal stripe represent a patient’s history, color coded by the stage of the patient’s disease at the onset of treatment at month zero. Different events are plotted within each patient’s stripe. The term “Swimmer Plot” comes from the resemblance to lanes in a swimming pool, where swimmers (events) must stay in their own lanes.
The paper cited above gives a detailed protocol and related code for constructing this chart in a statistical graphics package that I’ve never used. The Peltier Tech blog reader wondered if there was a way to build this chart in Excel without having to draw shapes on the chart and locating them inexactly with the mouse.
I’m never one to pass up a reasonable challenge in Excel charting, so I decided to give it a try. This is the kind of chart that can probably be adapted to a variety of uses, and it’s a great way to help people learn how to think outside their usual approaches, to push Excel beyond its supposed limits.
The Approach
One’s first thought might be to build a horizontal bar chart, then use XY scatter data series for the various symbols on the bars. The arrows would have to be shapes drawn on the chart. This mixture of bar and XY chart types causes problems when trying to synchronize axes and aligning markers with bars.
On second thought, though, this chart can be built using XY chart series exclusively. The thick horizontal bands can be made from thick-lined error bars. The arrows at the end of certain bands can also be made using error bars. The only shape needed is for the arrow in the legend.
This entire protocol is one of my longest ever, but it will be worth the ride.
Swimmer Plot Data
All of the data in the chart conforms to two axes. The horizontal (X) axis shows months, either the total duration of the patient’s history or the treatment events. The vertical (Y) axis is simply the index of the subjects (patients): subject 1 is the bottom band in the chart and its associated symbols, subject 2 is the second band and symbols, on up to subject 10 at the top of the chart. In this example, the X axis data has been color coded blue and the Y axis data orange.
This first set of data shows the endpoints of the patient histories. Patients 10 and 3 have month data (18.2 and 9.5 months) in the Stage 1 column, patients 9, 5, and 1 in the Stage 2 column, etc. I optically extracted this data from the chart above before I realized it was included in the pdf article. I’ll take responsibility for any transcription errors.
Each set of markers in the chart need X (month) and Y (subject) data. The ranges below show data for the four sets of markers and the set of arrows that will be added to the chart.
Building the Chart: The Swimmer Lanes
The first block of data is used to create the bands in the swimmer chart. Excel’s usual arrangement is to have X values in the first column of the data range and one or more columns of Y values to the right. Our data has Y values in the last column, and several columns of X values to the left. So putting this data into the chart will take a few steps.
Select the Disease Stage 1 column of data, and hold down the Ctrl key to select the unlabeled column of subject indices, then insert an XY Scatter chart. Most of the Disease Stage 1 column is blank, so only two points appear.
I’ve already stretched the chart above to its final size, and I’ve scaled the axes using their ultimate scale parameters. The X axis extends to -1 to allow room for the Durable Responder indicators.
The chart below has had the tick marks and labels removed from the vertical (Y) axis. Note that I’m leaving out chart and axis titles and other annotations to avoid distractions.
Now the next three Disease Stage series must be added. You can do this in at least two ways. My favorite is to select and copy the data, using the Ctrl key if needed to select discontiguous regions, then select the chart and use Paste Special from the Home tab. Use the settings on the screen shot below: add data as new series, values in columns, series names in first row, categories in first column.
Alternatively, right click on the chart and choose Edit Data from the popup menu or click on Edit Data on the Chart Tools > Design tab to open the Select Data Source dialog. Click the Add button, then populate the Edit Series dialog with the ranges containing the data for the new series.
Repeat until all of the Disease Stages are potted on the chart.
I’ve formatted the markers with the desired formatting for the swimmer plot lanes. Basically I picked four colors that were a bit darker than in the original chart from the cited paper, and lightened them by applying 30% transparency.
Shameless Plug
Previously I noted that the Disease Stage data for this chart was listed with several columns of X data and one column of (shared) Y data. This differs from Excel’s assumption that the data consists of one column of (shared) X data and several columns of Y data. Because of this data arrangement, the chart has to be created tediously, one series at a time.
Such tedium isn’t necessary, however. Using Peltier Tech Charts for Excel, I was able to use the Quick XY Charts feature to create the chart with all of these series in one shot.
I selected the data, clicked on the Quick Charts button on the Peltier Tech ribbon, selected Series in Columns and the X-X-Y data layout (highlighted above). The result was this XY Scatter chart, which requires much less manipulation to generate our Swimmer Plot.
The series need to be formatted: square markers with the colors and transparency described above and no lines. Also the legend should be moved to the right of the chart.
Back to the Lanes
We’ll use thick error bars as the swimmer lanes. Select one series and add error bars using the “+” icon in Excel 2013…
… or using the Chart Tools > Layout tab in Excel 2007 and 2010.
Excel adds horizontal and vertical error bars to an XY series.
Select the vertical error bars, and click the Delete key. Then select the horizontal error bars and press Ctrl+1 (numeral one) to open the format dialog or task pane. This is the Excel 2013 task pane, but the Excel 2007-2010 format dialog is essentially the same.
Select the options shown above: Minus direction only, No End Caps, and Percentage using 100%. The chart below shows Minus 100% error bars with End Caps not removed.
Format the error bar lines with the colors used for the markers, and select a thickness for the bars which will be suitable for the chart.
Hey look, bars without a bar chart.
Add and format error bars for the rest of the Disease Stage series.
We need to remove the markers from the chart without removing them from the legend. We have to trick Excel into thinking each series in the legend is formatted with markers while each point is formatted without.Click once to select a series, then again to select a marker, then click Ctrl+1 to open the formatting dialog or task pane for the single point, and choose the No Marker option. The marker from the top Stage 1 point has been removed in this chart:
Repeat; all Stage 1 markers are hidden below:
And repeat for all of the markers:
Add the Treatment Events
Several sets of XY markers are now added. You can copy the data,select the chart, and use Paste Special as described above, or you can use the Select Data Source dialog.
Here is the chart with Complete Response Start markers added to the lanes, with the markers formatted as red triangles like those in the example chart.
Now Partial Response Start points have been added.
Finally, Response Episode End has been added.
Add the Continuation Arrows
The continued Response data is displayed not as markers but as arrows, indicating that the activity extends further to the right than the lanes. This is easy to do with an XY series that shows no markers but instead has error bars with arrowheads on the ends of the bars.
First, add the data as a new series with no markers and a heavy black line. The heavy dark line will somewhat resemble the arrows in the legend.
Add error bars (use the same method as for the error bar lanes above).
Select and delete the vertical error bars as before, then format the horizontal error bars as follows: Plus direction only, No End Caps, and a Fixed Value of 1 (e.g., one month along the X axis).
Then format the error bar lines as medium thickness black lines with the appropriate arrowhead type and size. The “begin” arrow points toward the markers, the “end” arrow points away.
The result is a set of arrows extending beyond some of the swimming lanes.
Now we need to remove the black lines from the chart without removing them from the legend. In the same way we hid the unwanted markers from the chart but kept them in the legend to denote the lanes, we can select the entire series with one click, then select one individual line segment, then press Ctrl+1 and format this segment as No Line.
Here the first segment has been hidden.
Now the second segment is hidden.
A few clicks later, all traces of the line are gone.
Add Durable Responder Indicators
Finally, the Durable Responder data is added to the chart. The -0.25 X values place the markers just left of the vertical axis and the start of the lanes.
Final Adjustments
A couple small adjustments improve the look of the chart. First, the vertical axis scale of 0 to 11 leaves rather wide margins above and below the data. If the axis scale min and max are changed to 0.25 and 10.75, this margin is slightly reduced.
The -1 horizontal axis minimum is strange, but changing the horizontal axis number format to 0;;0
hides the negative value.
Finally, if the thick black line is not acceptable as a legend marker for the Continued Response arrows, you can use an arrow for the legend entry.
First, use a white line instead of the black line for the series lines, then hide each segment of this line as you hid the black line segments; this places a white line in the legend. If you had formatted the series as No Line, the line in the legend would not be there, and the markers would be squeezed right up against the legend labels, which does not leave room for a centered arrow.
Then insert an arrow from the Insert Shapes gallery on the ribbon. Format it as a heavy line with the matching arrow appearance. Drag it into the desired position in front of the Continued Response label.
Wrap Up
Charts like this swimmer plot, the related Gantt chart, and other charts were difficult to make in Excel 2003 and earlier versions of Excel. You needed to combine stacked bar chart and XY scatter chart data in the same chart. This required special effort to ;ine up the XY markers with the bars, and keep them aligned.
Excel 2007 introduced much more flexibility in formatting of lines, so you can make thick bars with any line segments, such as series connecting lines and error bars. Swimmer plots and Gantt charts are much easier to make and maintain, because markers and bars are easy to keep in place using a single set of axes.
Bob Lalonde says
Super post Jon.
Bob
Vincent (Paris) says
Thanks for this interesting post Jon, just have to say perfect.
PS : In my configuration, the last format you evoke in the text “The -1 horizontal axis minimum is strange, but changing the horizontal axis number format to 0;0; hides the negative value” was 0;;0.
Jon Peltier says
Thanks, Vincent. Slight oversight on my part, which I’ve corrected.
Josh says
Hi Jon,
What do you think about doing this with an non-continuous intervention? I.e. one that starts and stops for each participant?
Jon Peltier says
Josh –
This complicates the problem, of course, but it’s still possible. In my example above, I have a (hidden) marker at the right end of each bar, and the bar is formed by a minus error bar using 100% of the value for the error bar length. In your case, I’d need one marker for the right end of each bar segment (there may be multiple ones for each horizontal lane, and there might be bars of more than one color in each lane. I’d also need a custom error bar length for each marker, which I would store in another column in the sheet.
Hui-li says
This was really helpful Jon, thank you.
Veronica Ruiz salas says
Hi John,
Could you please give me some help to make a swimmer plot?
I m not an expert on excel
Thank you
Jon Peltier says
Veronica –
I know it’s a long protocol. At which stage did you run into trouble?
JD says
Hello Jon,
This was very helpful. Thank you! For some reason I can’t get my lines to display from smallest to largest like in your graph. Do you have any suggestions on how I can fix this?
Thank you,
JD-
Oliver says
I am not sure what I am doing wrong.
I have prepared 5 columns of my data (looking at depth of response in patients with a type of cancer) so similar to yours.
My headers are essentially 1, 2, 3, 4 for response depths then a 5th column with consecutive patient numbers 1 to 53. I can highlight column 1 and insert a scatter plot of this. If I also highlight the numbers 1-53 (which would be the y-axis), the graph appears with an additional straight line plottted on the graph.
Is there some preparation one has to do rather than just putting in numbers to excel in step 1? You have the X and Y values in different colours with headers…is this just to demonstrate what the data is showing or are you configuring it in advance?
I couldnt see paste special anywhere on the home tab either…(excel 2016).
Novice with this so any guidance much appreciated. This type of plot would be perfect for my data and this seems to be the only explanation on the web..
I also have access to GraphPad Prism and SPSS is there’s an easier way to do this!
Any assistance would be greatly appreciated!