Presenting Data and Information by Edward Tufte
- Instructor: Edward Tufte
- What: Data Visualisation workshop by Edward Tufte
- Status: Currently ongoing
"The information is the interface"
The course kicks off with a lullaby, MIDI notes charted visually. A project management chart for a lullaby by Chopin. Can hear a note better by tracking it visually.
Clutter, confusion, and overload are not inherent to information. They are failures of design.
"Making and consuming presentations is an intellectual and moral activity. It's all about your content and credibility."
Spectatorship is an intellectual and moral activity that must hold the presenter responsible and judge the content.
"Excellence in statistical graphics consists of complex ideas communicated with clarity, precision, and efficiency." - Edward Tufte, B-The Visual Display of Quantitative Information
Good data visualisations should:
- Show the data.
- Use a well-suited format and design.
- Induce viewers to consider the content rather than the methodology, design, or other trappings. Avoid 'chart junk' and content-free decoration.
- Not distort the data.
- Be information dense.
- Make large data sets understandable.
- Reveal the data at different levels of detail, from a broad overview to fine details, at an accessible level of complexity.
- Serve a clear purpose with a narrative quality.
- Work closely with the data and verbal descriptions of the data, using words, numbers, and drawings together.
- Use an appropriate balance, proportion, and sense of relevant scale.
"Graphical elegance is often found in simplicity of design and complexity of data."
"Graphics reveal data"
Insights and patterns that are hidden in raw data can become clear when displayed graphically. However, "statistical graphics, just like statistical calculations, are only as good as what goes into them." Put bad data in, reveal silly 'trends'.
The choice of design
Basic structures for conveying data:
- the sentence
- the table
- the graphic
A sentence, while interesting to think about as an essential unit of data communication, is not very helpful for more than one unit of information, as it prevents easy comparisons within the data. That's where tables come in (hello Resources for designing data tables). But beware The perils of pie charts. "The only worse design than a pie chart is several of them."
What story does the data reveal? "There are nearly always better sequences than alphabetical."
Simplicity ≠ Clarity, Information dense ≠ complex
"Clutter and confusion are failures of design, not attributes of information."
Leverage visual comparison (which humans are great at), not visual memory (which humans are not so great at). [[!Context switch]]ing undermines information exchange and happens when users are forced to remember things across views. Short-term memory is then occupied with retrieving and holding information rather than analysing the data.
Information-dense displays give control of information to viewers rather than editors or designers. They require an active rather than passive viewer. This does not mean cluttered or confusing presentations, for the quantity of detail is a separate issue from the difficulty of reading.
"It is not how much empty space there is, but rather how it is used. It is not how much information there is, but rather how effectively it is arranged."
In Interaction of Color, Josef Albers talks about how stripping the detail out of typography actually makes words harder to read (see [[!Serif vs sans-serif]]) and the same principle can be true for data visualisation. Simpleness is an aesthetic, not a strategy or guide to clarity, and can result in useless or uninteresting data.
"It must embody the difficult unity of inclusion rather than the easy unity of exclusion... Where simplicity cannot work, simpleness results. Blatant simplification means bland architecture. [[!Less is a bore.]]" - Robert Venturi, Complexity and Contradiction in Architecture
Tufte points out that the world we seek to understand is complex and intricate, so it is only right that we portray it with appropriate intricacy and detail. "God is in the details," according to Mies van der Rohe, and that sums up Tufte's central thesis in this portion: disentangle the presentation and aesthetic of the information from its inherent complexity and rich content. A good design should be capable of presenting extremely complex, information-dense content in a clear, understandable, and enlightening fashion.
Google Maps is possibly the most seen visualisation in human history, and it has layers upon layers of data encoded in a readable format.
Design for a diversity of data and viewers
Tufte starts by diving into a seemingly simple website for the National Weather Service and how it presents forecast data.
"When you show people data, your job is not to dumb things down, it's to make everyone who looks at it smarter." One visualisation with different layers of complexity and detail can serve different users and enlighten them all without over-simplifying for one or the other.
Do whatever it takes to convey information. Don't segregate by the medium or method of communication. Words, pictures, and so on are all information at the end of the day.
Allow people to 'edit' and choose what to look at. We as humans already do this all the time.
Charts encode information
The issue with pies and bar charts is that the data is encoded into an area or colour or both. This means viewers have to decode it before they get to the data. And that code can vary from chart to chart. Sometimes the best way to display information is simply to show it, not encode it. Start with words and numbers. This can also help support different levels of exactitude for different viewers.
Eliminate every impediment and noise. Even a colon makes a difference when millions of people extract data from a display every day.
Be efficient in your display. No one has ever wished meetings longer. For presentations to a content audience, keep your formats straightforward and the content rich. Viewers should spend their time understanding the content, not decoding the design. Conventional does not mean universal, but conventional to your audience.
Consistency is not everything
Always enquire as to the motivations behind consistency. Sometimes consistency is good, other times it is 'because you should' or a brand marketing ploy that requires much effort with little impact.
Sparklines: word-sized graphics
Graphics should have at least the resolution of type, as that is what people can read.
The most common data display is a noun with a number, with their relevant context giving meaning to a single number. Ex: glucose 6.6. Pairing this with a small sparkline (trend line chart), can encode additional information in a very small space. Think of what you see in stock listings, spreadsheet cells, etc.
Consider the task of the viewer. For clinical analysis, is the task to detect and assess deviations outside the normal limits? Emphasise that. Perhaps use a band that not only dampens values within the normal limits, reducing noise, but also highlights those outside.
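A rough sketch of the noun + number + sparkline idea, using Unicode block characters so the graphic sits at the resolution of type. This is not from the workshop: the function names, the glucose readings, and the 4.0 to 6.0 "normal band" are all made up for illustration and are not clinical guidance.

```python
# Eight standard Unicode block characters, from lowest to highest.
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Scale each value to one of eight block heights, relative to the
    series' own minimum and maximum. (Hypothetical helper.)"""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no shape to show; draw the lowest block throughout.
        return BLOCKS[0] * len(values)
    return "".join(
        BLOCKS[int((v - lo) / span * (len(BLOCKS) - 1))] for v in values
    )


def out_of_range(values, low, high):
    """Quiet the readings inside the normal band ('·') and flag the ones
    outside it ('!'), so the viewer's task is emphasised. (Hypothetical helper.)"""
    return "".join("!" if (v < low or v > high) else "·" for v in values)


# Made-up readings and an illustrative normal band.
readings = [5.1, 5.4, 4.9, 5.8, 7.2, 6.9, 6.6]
normal_low, normal_high = 4.0, 6.0

# Noun, latest number, then the word-sized trend next to it.
print(f"glucose {readings[-1]} {sparkline(readings)}")
print(f"{'':12}{out_of_range(readings, normal_low, normal_high)}")
```

The single number still leads, but the trend and the out-of-band readings now travel with it in roughly the space of one extra word.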
"There is a tremendous amount of information, even in a single letter."
[[!Recency bias]] gives too much weight to the most recent data coming in, a.k.a. panic. Providing proper context and more data (highs, lows, historical values, etc.) can combat this.
Use direct labels where possible, not a key. Colours can correspond to each other without the need to explain.
Science shows the data, reports uncertainty
If you know something well, you should be able to explain it in ordinary language. If you can't or rely on jargon, you probably don't fully understand it (see the [[!Feynman technique]]).
The truth depends on the correspondence between the model and the data. The danger of hiding the data is you can fit any model to data and cherrypick or oversimplify. In other words, you can lie.
- Truncation: showing only a small sample of the data, especially one that fits your forecast or point. You want to see more points extending along the horizontal axis because they provide the context of the data.
- Binning: condensing data into a single number or chart element. Un-binning reveals deeper details and context around that data.
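Both tricks can be seen with a few lines of arithmetic. The sketch below uses invented numbers (not data from the workshop) and a hypothetical `net_change` helper: a truncated window tells the opposite story to the full series, and two very different groups collapse into the same binned summary.

```python
from statistics import mean

# Made-up monthly values: a long decline followed by a short rebound.
series = [90, 84, 79, 73, 68, 62, 57, 51, 46, 40, 44, 48]


def net_change(values):
    """Difference between the last and first point shown. (Hypothetical helper.)"""
    return values[-1] - values[0]


# Truncation: show only the last three points and the story flips.
truncated = series[-3:]
print("last 3 points:", net_change(truncated))  # positive: looks like growth
print("full series:  ", net_change(series))     # negative: a long decline

# Binning: two made-up groups that share one summary number.
steady = [50, 50, 50, 50, 50, 50]
volatile = [5, 95, 10, 90, 20, 80]
print("binned means: ", mean(steady), mean(volatile))

# Un-binning: the spread that the single number was hiding.
print("steady range:  ", min(steady), "to", max(steady))
print("volatile range:", min(volatile), "to", max(volatile))
```

Neither summary is false on its own terms, which is the point: the distortion comes from what was left out, so the question to ask is what the rest of the data looks like.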
Novel findings and fiction
Don't trust truncated and summarised data. Or the people showing it. The defence is that it would clutter things up. But that means someone thought it was cluttered. Someone edited it.
The Phillips Curve is an example of an economic model which did not hold up but seemed perfect on a limited dataset. Fresh data remodels models. "Their argument is not even wrong, it can't be tested by evidence."
Models can create imaginary thresholds and/or plateaus not present in the data. You can try hundreds of models on a dataset and cherrypick one. Follow the money, get to the original data.
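To get a feel for how cherry-picking manufactures a finding, here is a small simulation sketch. It is an illustration rather than anything shown in the workshop: every "predictor" is pure random noise, the function names and default sizes are assumptions, and Pearson correlation stands in for "fitting a model".

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient, written out to stay dependency-free."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def cherry_pick(n_points=20, n_candidates=200, seed=0):
    """Correlate many unrelated random predictors with a random outcome.

    Returns (best, typical): the strongest absolute correlation found and
    the median one. (Hypothetical helper; sizes are arbitrary.)
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_points)]
    scores = []
    for _ in range(n_candidates):
        predictor = [rng.gauss(0, 1) for _ in range(n_points)]
        scores.append(abs(pearson(predictor, outcome)))
    return max(scores), sorted(scores)[len(scores) // 2]


best, typical = cherry_pick()
print(f"typical |r| among candidates: {typical:.2f}")
print(f"best |r| after cherry-picking: {best:.2f}")
```

By construction nothing here is related to anything else, yet the best candidate will always look stronger than the typical one. Reporting only the winner, without saying how many candidates were tried, is the move to watch for.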
Beware self-congratulation, 'significantly' (there are different types of significance: statistical? substantive? What's the data?), 'novel', showing off the n-number, etc.
The human eye-brain system is incredibly powerful and processes huge amounts of data. Use it.
Increased resolution means not having one thing after another ('the deck', controlled by an editor) but being able to see more all at once (controlled by user, everyone can scan and choose differently). It's the same in presentations. The only thing worse than someone reading aloud from a bullet point list is when the bullets are revealed one by one. The rate of information transfer is dropping to zero.
The Braun corporation dishwasher loading manual
Encoded chart data leaves users searching for how to decode it rather than taking in the data. If possible, don't separate the labels from the data. The less the data is encoded, and so the less the viewer has to decode, the better. People have not come to learn codes. They've come to see the data.
Use direct labelling. Why have people memorise colours and jump between chart and a key? Especially for accessibility and colour blindness.
Analytical thinking principles
- Make comparisons (also a fundamental task of data analysis)
- Explain, not just report. Cause and effect, not just correlation.
- Investigate the credibility of the source of the data.
Analytical thinking helps us reason about the relationship between information and conclusions. It is content independent. Looking for causality, not necessarily the content. So you can make judgements (like in the case of data visualisations and presentations), even if you do not know as much about the content. Focus on the relationship between evidence and conclusion.
Even if a presentation obscures, don't assume motives. Judge based on what you have evidence on. Conspiracies and malice are greatly overrated, accidents and incompetence are greatly underrated. Focus on the content, don't assume the character.
Turn analytical thinking into analytical design
"The purpose of information display is to assist reasoning about the content."
- Show comparisons, contrasts, and differences.
- Show causality, mechanism, explanation, systematic structure.
- The world is multivariate and spatial, not flat. Show multivariate data (more than 1-2 variables).
- Completely integrate words, numbers, images, diagrams, whatever. It's all information. Don't segregate by the mode of production.
- Document the evidence. What is this? Who is involved? Who are the sponsors? What are the data sources? Show the measurement scales, point out relevant issues, demonstrate your credibility and show the data. No one is likely to look at it, but it points to integrity of material. It promotes honesty and responsibility. Beware of presenters who will not share.
- Content matters most of all. The best way to improve your presentation is to get better data. Analytical presentations should stand or fall depending on the quality, relevance, and integrity of their content.
[[!Software segregates information]]
"Analytical presentations ultimately stand or fall depending on the quality, relevance, and integrity of their content."
The world profits by segregating information by the mode of production. Apps own data, and you can only access your own data in particular apps, operating systems, pages, etc. But this is detrimental to information and to us. Users don't get to think about their documents or data; they have to think about all these applications. Helpful for business, yes, but unnatural and not beneficial for users. Software says you cannot draw, write, read, and share in the same tool. We have gotten used to working around it.
ASKING: How do you know that?
"Truth is truth. It cannot be overruled by any speciality, or anything you do, think, or believe."
- How do I know that?
- Can I recognise when I know less than I think? Confirmation bias is strong.
- Then ask how do you know that?
- How can anyone possibly know that? Perform thought experiments. Are there any research designs that could answer this question? The claim might not even be wrong, it could be impossible to prove.
Why research on humans is way more difficult than rocket science
Nature's mathematical laws apply to every particle everywhere, forever. Truth and exactitude are always present. There are certain universal models in biological systems, but this exactitude is not present everywhere. The researchers are humans with biases, and they are researching humans who can act, think, connive, etc.
Small tilts throughout the process can completely change (or create) a finding.
Smarter meetings means shorter meetings.
Cut and paste was the first thing invented on the GUI