Information shared on Twitter by bystanders and eyewitnesses can help law enforcement agencies and humanitarian organizations obtain firsthand, credible information about an ongoing situation. However, identifying eyewitness reports on Twitter is a challenging task. We investigate the sources of eyewitness-related tweets, classify them into different types, and examine the characteristics associated with each eyewitness type. We observe that words related to perceptual senses (feeling, seeing, hearing) tend to be present in direct eyewitness messages, whereas emotions, thoughts, and prayers are more common in messages from indirect witnesses. We use these characteristics and labeled data to train several machine learning classifiers. Our experiments on several real-world Twitter datasets show that textual (bag-of-words) features, when combined with domain-expert features, achieve better classification performance.
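As a rough illustration of what combining bag-of-words features with domain-expert features might look like, the sketch below appends two lexicon-based counts to a plain word-count vector. The word lists, function names, and example tweets are hypothetical and invented for this example; they are not the feature set or data used in the work described above, and the resulting vectors would still need to be passed to a classifier of your choice.

```python
import re
from collections import Counter

# Hypothetical lexicons, invented for illustration only.
PERCEPTUAL_WORDS = {"see", "saw", "seeing", "hear", "heard", "hearing",
                    "feel", "felt", "feeling"}
INDIRECT_WORDS = {"pray", "prayers", "praying", "thoughts", "hope", "sad"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(texts):
    """Map every token seen in the corpus to a fixed column index."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def featurize(text, vocab):
    """Return bag-of-words counts followed by two lexicon-based counts."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    # Textual features: one count per vocabulary word, in index order.
    bow = [counts.get(tok, 0) for tok in vocab]
    # Domain-style features: how many tokens fall in each lexicon.
    perceptual = sum(1 for tok in tokens if tok in PERCEPTUAL_WORDS)
    indirect = sum(1 for tok in tokens if tok in INDIRECT_WORDS)
    return bow + [perceptual, indirect]


# Made-up example messages, not drawn from any real dataset.
tweets = [
    "I just felt the ground shaking and heard a loud crash",
    "Thoughts and prayers for everyone affected by the earthquake",
]
vocab = build_vocabulary(tweets)
vectors = [featurize(tweet, vocab) for tweet in tweets]
```

In this toy setup the last two entries of each vector separate the two messages: the first scores on the perceptual lexicon, the second on the thoughts-and-prayers lexicon.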