Merge overleaf-2024-11-01-0757 into main
yamanksingla authored Nov 1, 2024
2 parents 008ce6e + 1959530 commit aff5441
Showing 7 changed files with 259 additions and 217 deletions.
11 changes: 9 additions & 2 deletions Conclusion.tex
@@ -1,9 +1,16 @@
%\addcontentsline{toc}{chapter}{Conclusion and an Outlook for Future Work}
\chapter{Conclusion And An Outlook For Future Work}
\chapter{Conclusion and an Outlook for Future Work}
\label{chapter:conclusion}



This thesis has explored the intersection of communication theory, behavioral science, and artificial intelligence, with a particular focus on understanding and optimizing human behavior through large-scale modeling approaches. Our work builds upon the fundamental seven-factor model of communication—communicator, message, channel, time of receipt, receiver, time of behavior, and receiver's behavior—while leveraging unprecedented access to digital behavioral data to advance both explanatory and predictive approaches to behavioral science.
In the domain of persuasion strategy analysis, we have made significant contributions to understanding the mechanisms of influence in advertising. Through comprehensive research spanning marketing, social psychology, and machine learning literature, we developed the most extensive framework of generic persuasion strategies to date. This work was supported by the creation and release of pioneering datasets for studying persuasion strategies in both image and video advertisements. Our analysis established clear correlations between specific marketing campaign characteristics and measurable customer behaviors, providing valuable insights for both practitioners and researchers in the field of marketing communications.
Our development of Large Content and Behavior Models (LCBMs) represents a fundamental advancement in behavior modeling. Through careful analysis, we revealed that existing Large Language Models (LLMs), despite their remarkable capabilities in various domains, are inherently limited in modeling behavior due to the systematic removal of behavioral data during training. To address this limitation, we developed the LCBM approach, which integrates all seven factors of communication to create more comprehensive models of human behavior. To support future research in this area, we released extensive behavior instruction fine-tuning data derived from over 40,000 YouTube videos and 168 million Twitter posts. Additionally, we established new benchmarks for evaluating joint content-behavior understanding, encompassing both predictive and descriptive tasks.
The thesis has also made significant strides in demonstrating how behavioral signals can enhance content understanding. Our research showed substantial improvements on 46 tasks spanning 23 benchmark datasets across the text, image, video, and audio modalities. We proposed a scalable approach to enhance Vision Language Models (VLMs) without requiring significant architectural changes, making our improvements readily accessible to the broader research community. These results support our hypothesis that behavioral responses provide valuable signals for content understanding, opening new avenues for improving AI systems' comprehension capabilities.
In the realm of content generation, we made pioneering contributions in both text and visual domains. Through our work on memorability optimization, we developed Henry, which achieved a 44\% improvement in memorability scores through progressive generation techniques. This represents the first successful application of synthetic data in a domain previously lacking large-scale training resources. In the visual domain, we addressed the critical need for engagement-optimized image generation through the development of EngageNet and the creation of EngagingImageNet, a comprehensive dataset of 168 million tweets with associated media and engagement metrics. Our introduction of Engagement Arena, the first automated benchmark for assessing the engagement potential of text-to-image models, provides the research community with a valuable tool for evaluating and improving engagement-oriented image generation techniques.
Looking ahead, this research opens several promising directions for future work. The integration of behavioral data into AI systems could lead to more nuanced and context-aware models that better understand and predict human responses. There is significant potential for extending our approaches to other domains and modalities, particularly in areas where human engagement and response are crucial metrics of success. Additionally, our work on content generation optimization could be expanded to consider multiple behavioral objectives simultaneously, creating content that is not only engaging but also informative, memorable, and persuasive.
Finally, as we stand at the cusp of what we identified as the fourth major phase in the study of communication, driven by unprecedented access to digital content and behavioral data, our work provides a foundation for future researchers to build upon. The tools and methodologies we have developed demonstrate the potential for artificial intelligence to advance our understanding of human behavior and communication, while also highlighting the importance of maintaining a holistic view that encompasses all aspects of the communication process. As these technologies continue to evolve, they promise to provide even deeper insights into human behavior and more effective means of optimizing communication for various objectives.
Our contributions not only advance the field of behavioral science but also provide practical tools and insights for practitioners in marketing, content creation, and communication. By bridging the gap between theoretical understanding and practical application, this thesis lays the groundwork for future innovations in both academic research and real-world applications of behavioral science and artificial intelligence.



2 changes: 1 addition & 1 deletion abstract.tex
@@ -22,7 +22,7 @@ \chapter*{Abstract}
Next, we turn our attention towards behavior prediction by constructing general behavior models. These models, similar to large language models, aim to understand behavior \textit{in general}, as opposed to being designed for a specific behavioral task. We use the large repositories of digital analytics to train these models. The format of this data is the general communication model consisting of the communicator, message, time of message, channel, receiver, time of effect, and effect. We call these models Large Content and Behavior Models (LCBMs). We further show that large language models, while being used as general-purpose models for a variety of tasks in different domains, are unable to solve behavioral problems. We investigate the reason for this and find that, while training LLMs, behavioral data is removed as noise, due to which they lose their behavioral capabilities.


We also show that adding the behavioral training data back leads to other positive side effects. Namely, we show that since behavior is an aftereffect of content (message), we can make inferences about content by looking at the receiver behavior. For example, blood pressure or eye dilation levels while watching the movie Jurassic Park indicate the excitement level of different scenes. We show results for this hypothesis on more than 30 content understanding tasks across all four modalities of text, image, video, and audio.
We also show that adding the behavioral training data back leads to other positive side effects. Namely, we show that since behavior is an aftereffect of content (message), we can make inferences about content by looking at the receiver behavior. For example, blood pressure or eye dilation levels while watching the movie Jurassic Park indicate the excitement level of different scenes. We show results for this hypothesis on more than 40 content understanding tasks across all four modalities of text, image, video, and audio.


Finally, we make initial strides towards solving the problem of generating performant content. We show this both for performant text generation, by taking the illustrative case of the behavior of memorability, and for images, by generating images that are more engaging. We also develop mechanisms to measure the engagement potential of text-to-image generation models. We show that existing metrics to benchmark the quality of text-to-image generation models are not correlated with engagement. We develop a model to measure the engagement potential of an image. We release the first automated arena to benchmark the engagement of text-to-image models. We rank several popular text-to-image models on their ability to generate engaging images and further encourage the community to submit their models to the arena.
4 changes: 2 additions & 2 deletions iiitd.cls
@@ -326,8 +326,8 @@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Renewed commands to set the titles of various pages correctly.
\renewcommand\contentsname{\centering TABLE OF CONTENTS}
\renewcommand\listfigurename{\centering LIST OF FIGURES}
\renewcommand\listtablename{\centering LIST OF TABLES}
\renewcommand\listfigurename{\centering List of Figures}
\renewcommand\listtablename{\centering List of Tables}
%\renewcommand\listsymbolname{\centering LIST OF SYMBOLS}
\renewcommand{\chaptername}{CHAPTER}
\renewcommand\bibname{\centering REFERENCES}