A Robot Wrote This?: Readers’ Perceptions of Automated Technical Writing

Jessica Campbell

Interesting study. The pairing of your two methods provides good data sets to compare and contrast. How did you locate an organization that uses AW in its content strategy to interview? Is there a way to identify AW in media? Do you find there is a genre or subject that AW is a better fit for?

Yingying Tang

Hi Jessica, thank you for your questions! I located the companies needed for this research and established contact with them with the help of some practitioners I met during my previous internship. I chose these two companies because their main businesses are providing AW content and services, which are the research focus of this study. Moreover, the two companies are typical representatives of AW startups: very young, small in size, and their clients are mostly large, international corporations that need massive content production. So I selected them for this exploratory case study.
Currently, almost all the big news agencies have developed their own AW bots (e.g., the New York Times' Editor, the Washington Post's Heliograf, BBC's Juicer, Reuters' News Tracer) and use them to write finance and sports news, crunch data, dig out hot news, detect fake news, and moderate readers' comments (Underwood, 2019). But most likely you won't know or notice that the content is generated by algorithms. According to Montal and Reich's study, agencies that use algorithms usually do not provide clear author information for automated content, due to the lack of updated byline policies on AW and their "human-centered perception of authorship."

Montal, T., & Reich, Z. (2017). I, robot. You, journalist. Who is the author? Authorship, bylines and full disclosure in automated journalism. Digital Journalism, 5(7), 829-849.

In current business practice, AW is most capable of performing rule-based, routine writing tasks, and it is good at data collection and analysis. So it is best at writing data-based content like sports/finance news, business reports, technical documents (based on existing documents), and even scientific journal articles. But with advances in NLP, it has the potential to be applied more widely in creative content creation like art and literature. The Next Rembrandt (https://www.nextrembrandt.com/), an AI painting generated by Microsoft's algorithm, is one such example. And such AI-generated artwork does have a huge market (https://www.bbc.com/news/technology-45980863#:~:text=An%20artwork%20created%20by%20an,New%20York%20before%20the%20sale.) and real potential for commercialization.
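To make "rule-based, routine writing" concrete, here is a minimal sketch in Python of what template-driven news generation looks like. The game data, team names, and phrasing rules are all invented for illustration; this is not any company's actual system:

# Minimal sketch of template-based automated writing: structured data in,
# routine prose out. All names and numbers here are invented examples.
game = {"home": "Lakers", "away": "Celtics", "home_pts": 112, "away_pts": 104}

def write_recap(g: dict) -> str:
    winner, loser = (("home", "away") if g["home_pts"] > g["away_pts"]
                     else ("away", "home"))
    margin = abs(g["home_pts"] - g["away_pts"])
    # Simple rules vary the verb by margin, mimicking how such systems
    # derive phrasing choices from features of the input data.
    verb = "crushed" if margin >= 15 else ("edged" if margin <= 5 else "beat")
    return (f"The {g[winner]} {verb} the {g[loser]} "
            f"{g[winner + '_pts']}-{g[loser + '_pts']}.")

print(write_recap(game))  # -> The Lakers beat the Celtics 112-104.

Commercial systems like Heliograf are far more sophisticated, but the underlying pattern is the same: structured data passed through rules and templates to produce routine prose.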

Jane Vaughan

I'm interested in your coding methods. I agree with Jessica that the two methods are an interesting contrast, but I always detest coding writing, so I'm impressed that you took it on. It would be interesting to work on an interdisciplinary study at some point with someone from Computer Science who would break down and examine the AI from that perspective. I'm also wondering if the objective is always to approach human writing as closely as possible. Are there any situations where that might not be the best objective, or where you might want the writing to be more identifiable as "machine-like"?

Yingying Tang

Hi Jane, thank you for your comments! It's great to learn that you are also working on this topic! This research project looks specifically at the application and end products of AW algorithms to see their impact on business, user perception, and potential ways to improve AW algorithms. But I am also interested in studying the algorithms themselves as computational programs that make ethical decisions in the content creation process. As you said, it would be great to include CS perspectives to learn how these algorithms are developed and trained, as well as the ethical choices programmers have made in coding, selecting and preparing training data, etc.
You may have already read this, but if not, the study of Omizo and his interdisciplinary team could be of interest to you:

Omizo, R., Clark, I., Nguyen, M., & Hart-Davidson, W. (2019). Inventing rhetorical machines: On facilitating learning and public participation in science. In J. Jones & L. Hirsu (Eds.), Rhetorical machines: Writing, code, and computational ethics (pp. 110-136). University of Alabama Press.

In this chapter, they discuss the invention and application of a rhetorical machine/algorithm named Faciloscope (a tool that assists museums in leading online activities and fostering public engagement and learning in science) and how it enacts ethical programs to affect human cognition.
With advances in NLP and larger training corpora, algorithms will be able to write increasingly like humans. I think most of the time that is the objective of the continuous training of AW algorithms: if algorithms can write more like human beings, they can assist and even replace/automate more human work, and that is why many companies develop AW algorithms (considering their efficiency, cost, and capability of producing massive amounts of content in a short time). But when algorithms can write increasingly like human beings, I think it is necessary and ethical to consider the byline and authorship of automated content. Byline information is related to the transparency, credibility, and responsibility of the content (think about the fake news bots on Twitter during the 2016 presidential election). AW authorship is another big topic. In short, I think AW challenges the traditional human-centered concept of authorship and blurs the human-nonhuman binary boundaries, as AW algorithms are performing key writing tasks that used to belong only to human writers. I am working on another project on this topic.

Yingying Tang

I'm sorry, Jane; I hurried to write you a reply on a plane, and I just realized I misunderstood your feelings about coding writing. I'm sure many people, especially people in our field, feel uncomfortable about using AI to automatically generate content. This practice challenges many key concepts in our field, like author, audience, authorship, genre, etc.

Regarding your question about my coding methods: for the interview study, I use semi-structured interviews to collect my qualitative data. So rather than doing content analysis on my interview transcripts, I mostly look at whether the interviews can answer my research questions and, if not, what follow-up questions I should prepare for the next round of interviews (I have four rounds of interviews). As for the qualitative data in the upcoming content assessment study, I will collect data from participants' comments and follow-up interviews. I will develop coding strategies based on the collected data, with the aim of detecting patterns in readers' comments and perceptions.

Jane Vaughan

Thank you so much for explaining your coding methods; that iterative process is similar to what I use as well. I am somewhat neutral on the issue of AI writing, to be truthful; I’m not that familiar with it from a rhetorical standpoint and I think studying it is definitely valuable! I knew that many of the major news stations used it, but I wasn’t aware that it’s prevalent in so many other areas of writing.

Yingying Tang

True! We hardly realize that AW is permeating every corner of our lives. I am quoting my previous article here: "The news stories we read may be written by news bots; our online comments are scrutinized, evaluated, and filtered by comment-moderating algorithms; on Facebook and Twitter, social bots are working day and night to reply to posts and interact with human users; in the apps of Bank of America, Lyft, and Spotify, customer service bots are answering customer queries and providing intelligent content 24/7, year-round. We read automatedly generated product descriptions on Amazon; we communicate with smart speakers like Google Home, Amazon Alexa, and Apple HomePod; and we learn our financial status from the visualized, annotated reports generated by our banks' AW systems."

But AW goes far beyond these examples if we consider its applications in education, medical inquiry, legal documents, and elder care.

Tang, Y. (2020, October). Promoting user advocacy through design thinking in the age of automated writing. In Proceedings of the 38th ACM International Conference on Design of Communication (pp. 1-6).

Kylie Jacobsen

Thank you for sharing your pilot study research on such an interesting topic! I’m especially intrigued by the future implications of the results.

I was wondering if you could talk about the significance of the 39% of readers who believed the AI-written material to be human-written. Is that percentage high? Not high enough? Could your research also provide a metric we can refer to in order to determine whether AI writing has achieved the same level of description, coherence, usability, pleasantness, and interest that human writers are perceived to have? Should it?

I was also wondering if you could talk more about your participants. What demographics do they share? How many did you recruit? Have you considered assessing reading literacy as a way to determine a baseline understanding of reader perceptions (maybe it’s not necessary)?

Yingying Tang

Thank you for your question, Kylie! Having participants read and compare two similar articles, one written by a human writer and one by an algorithm, and decide which one was written by the human is basically a Turing test! I think 39% is a pretty significant number, as it means 39% of the participants were "fooled" by the algorithm, believing the AI-generated text was human-written while the real human writer's text was machine-generated. For these participants, the algorithm surpassed the human writing it models. And as AW algorithms are still developing quickly, I think it will be even more difficult to distinguish AW from human writing in the future.
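As a side note on "significant": one way to check whether a 39% "fooled" rate differs from 50/50 chance guessing is a simple binomial test, sketched below. The sample size is assumed for illustration, since the pilot's n isn't given here:

from scipy.stats import binomtest

# Hypothetical check: is a 39% "fooled" rate distinguishable from
# 50/50 guessing? The sample size is assumed, not taken from the study.
n_participants = 100   # assumed n for illustration only
n_fooled = 39
result = binomtest(n_fooled, n_participants, p=0.5)
print(f"p = {result.pvalue:.4f}")  # a small p suggests the rate differs from chance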

My pilot study only collected quantitative data from a 5-point Likert scale, and my metrics are numeric. I think the quantitative data only tell me that there are differences in reader perceptions, but I also want to understand why and how participants perceive the texts differently. That's why I decided to also collect reader comments and conduct post-survey interviews in my follow-up study.
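For readers curious what that quantitative comparison might look like, here is a hedged sketch: invented 5-point Likert ratings for one perception dimension, compared with a non-parametric test, since Likert data are ordinal. None of these numbers come from the pilot study:

from scipy.stats import mannwhitneyu

# Invented 5-point Likert ratings (e.g., "trustworthy"), one list per
# article; these are illustrative values, not the study's actual data.
human_ratings = [3, 4, 3, 2, 4, 3, 3, 4, 2, 3]
ai_ratings = [4, 5, 4, 4, 3, 5, 4, 4, 5, 4]

# A Mann-Whitney U test suits ordinal Likert data better than a t-test.
stat, p = mannwhitneyu(ai_ratings, human_ratings, alternative="two-sided")
print(f"U = {stat}, p = {p:.3f}")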

For recruitment, I used a convenience sample and spread the survey form among my classmates and students. I didn't collect any demographic information in the pilot study, but I think it would be interesting to see how gender, age, and literacy level may influence readers' perceptions if the number of participants is large enough. I think reading literacy can be important, and even crucial, in user perception research related to AW's applications in basic medical information, legal information, and education for disenfranchised groups of people. AW content is useless if its target users cannot understand it. Hence it is important to recruit people whose literacy level is similar to the target user group's if the purpose of the study is to ensure the AW content is usable.

danielliddle

The comparison of the human and AI writers is very interesting here, especially the fact that the automated writing was read as more objective. I look forward to seeing how that quality factors into your future research on the specific textual features that lead toward or away from that perceived objectivity. It reminds me of that study from several years ago finding that readers see text set in Baskerville as more trustworthy.

I wanted to ask, what did you learn from the interviews with the two company affiliates? How did that affect your content assessment or stir directions for future study?

Yingying Tang

Thank you for your comment, Daniel! I was also excited when I found that participants thought the AI-generated report was more informative, trustworthy, accurate, and objective, while the human writing was more interesting and even more pleasant to read. If they had known who the authors were, I wouldn't have been so surprised, as that difference fits the stereotypes we have about human and algorithmic writing. But since they made the evaluation without knowing the author information, it is interesting to investigate further why they perceived the two similar articles so differently.
Based on my pilot interviews, I learned about practitioners' attitudes toward the unique strengths of human writers and algorithms. Your question actually inspires me to design questions asking my future content assessment participants (I will recruit them from students who have taken technical writing courses) about their understanding of the strengths and drawbacks of human/algorithmic writers, and to compare their ideas with the opinions of the practitioners. That may help us rethink our technical communication pedagogy in the automated writing era (what is good and what is lacking, etc.).

ameliachesley

This sounds so fascinating! I am intrigued to see where you might go next with this research. Jane's idea to plan an interdisciplinary study sounds very neat.

Yingying Tang

Thanks so much, Amelia! I do think Jane’s suggestion is great!

Joseph Bartolotta

Yingying,
This is all very interesting! Good work on the study so far! Can you tell us a little more about the texts that were evaluated? What topics were they generally about, and roughly how long were they? For my own purposes, I have wondered if AW deteriorates in quality as texts get longer. Is this something your study may shed light on?
Great work so far!

Yingying Tang

Thank you for your encouragement and questions, Joseph! The texts I used for the pilot study were two short sports news articles, each around 150 words. Though my study doesn't look at how length may influence the quality of the content, I may be able to provide some information for your reference. When the Washington Post's AW algorithm, Heliograf, was first used during the 2016 Rio Olympics, it could only generate short, multi-sentence updates for readers. But with advances in technology, it can now produce article-length news reports.
Personally, I think longer articles (especially in our field) usually involve more in-depth thinking and more complex logic, and that is an area AW algorithms are not good at. But if the genre is simply a business report or product description, I think the length of the article won't matter that much: as long as there is sufficient data, the algorithm can generate such routine, rule-based content indefinitely.
