1 Introduction
1.1 Abstract
Refer It Game is a two-player game in which users alternate between generating expressions that refer to objects in images of natural scenes and clicking on the locations of the described objects. The purpose of the game is to crowdsource natural language referring expressions, which are important for research on natural language generation and on dialogue systems such as Apple’s Siri and Amazon Alexa. Building a two-player game makes it possible to gather and examine referring expressions directly within the game, and later to perform experimental evaluations on the collected dataset.
1.2 Context
Every day, people around the world communicate with each other in various ways, but what we discuss and debate mostly concerns the visual world surrounding us. This makes understanding the connection between objects in the physical world and the language describing them a very important but challenging problem for artificial intelligence (AI).
Artificial intelligence is technology that is designed to learn and self-improve. Automation, machine learning and natural language processing are just a few examples of the many functions AI performs for us on a daily basis. This creates a wide range of research fields that can profit from a better understanding of how people refer to physical objects in our world.
Recent progress in automatic computer vision techniques has begun to produce very promising technologies for perceiving and distinguishing a large number of object categories (Perronnin et al., 2012; Deng et al., 2012; Deng et al., 2010; Krizhevsky et al., 2012). As a result, there has been a surge of recent work trying to estimate higher-level semantics, including exciting attempts to generate natural language descriptions of images automatically. Such approaches, however, often run into the problem that descriptions may be highly task-dependent, open-ended and difficult to evaluate automatically.
This is why we need a different but related approach to the problem of referring expression generation (REG). By creating a two-player game, available online, in which individuals refer to objects in composite images of scenes from the world around us, we enable researchers to collect not only referring expressions but also related information. The collected dataset can then be analysed in depth and later evaluated.
2 Literature & Technology Review
Literature
2.1 Crowdsourcing
Crowdsourcing refers to a sourcing model in which organizations or individuals use contributions from internet users to achieve a set objective. The word was adopted in 2005 and combines the words ‘crowd’ and ‘outsourcing’.
The idea is that crowdsourcing means outsourcing work to a crowd of people. It differs from outsourcing, however, because with crowdsourcing the work can come from an undefined public (rather than a predetermined group). Some of the main benefits of using crowdsourcing include improved speed, adaptability, cost, quality, diversity and versatility (Buettner, 2015).
Crowdsourcing has been highly beneficial in gathering the high-quality gold-standard data used to build automatic systems in natural language processing. As popularised by efforts like the ESP game (von Ahn and Dabbish, 2004) and Peekaboom (von Ahn et al., 2006), human-computation-based games can be a viable approach to engage users and gather large quantities of data inexpensively. Two-player games can likewise automate the verification of human-provided annotations.
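As a rough illustration of how a two-player game can verify annotations automatically, the sketch below accepts a referring expression only when the second player's click lands on the object the first player described. It assumes the target object is stored as an axis-aligned bounding box; the names `Box` and `verify_click` are hypothetical and are not taken from any existing game implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box of the target object, in image pixels."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        # A point counts as inside when it lies within both ranges (edges included).
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def verify_click(target, click):
    """Return True when the second player's click falls on the target object.

    A True result is treated as evidence that the first player's
    expression identified the object unambiguously.
    """
    x, y = click
    return target.contains(x, y)


if __name__ == "__main__":
    target = Box(left=10, top=20, right=110, bottom=220)
    print(verify_click(target, (50, 100)))  # click on the object
    print(verify_click(target, (5, 100)))   # click outside the object
```

A real game would more likely test the click against a segmentation mask than a box, but the verification principle is the same: the second player's action checks the first player's annotation without any manual review.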
2.1.1 Amazon Mechanical Turk
Amazon Mechanical Turk (MTurk) is an online crowdsourcing marketplace that enables businesses and individuals to coordinate the use of human intelligence to carry out tasks that computers currently cannot perform. The website is owned by Amazon.
Employers can post jobs known as Human Intelligence Tasks (HITs), such as writing descriptions, picking the best among multiple images of a storefront, or identifying performers in music recordings. Workers can then browse the existing jobs and complete them in exchange for a monetary reward set by the employer. Requesting programs place jobs using an API; requesters can alternatively use the more limited MTurk Requester site. To submit a job through the Mechanical Turk platform, a requester has to provide a billing address in one of about 30 approved countries.
2.1.2 CrowdFlower
CrowdFlower is a San Francisco-based crowdsourcing and data mining company. It provides a software solution through which users can access an online workforce to label, clean and enrich data, including messy and incomplete data. The majority of CrowdFlower users are data scientists who use the solution to build training models as well as machine learning algorithms.
As soon as data is uploaded into the system, the work is automatically allocated to contributors and is tested against established answers hidden within the task (called a “job” in CrowdFlower). The system trusts individuals based on how they perform on these hidden test questions. Contributors may continue working on a particular job as long as they remain trusted. If they lose that trust, they are removed from the job and their work is disregarded. The judgments of many contributors are collated, and the result is given as an aggregate answer with an associated confidence score (the contributors’ agreement weighted by the trust of each contributor).
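The trust-weighted aggregation described above can be sketched as follows. This is a minimal illustration and not CrowdFlower's actual implementation. It assumes that trust is the fraction of hidden test questions a contributor answers correctly and that confidence is the winning answer's share of the total trust. The function names and the `min_trust` threshold are hypothetical.

```python
from collections import defaultdict


def trust_score(gold_answers, responses):
    """Fraction of hidden test questions a contributor answered correctly.

    gold_answers: dict mapping question id -> established answer
    responses:    dict mapping question id -> this contributor's answer
    """
    answered = [q for q in responses if q in gold_answers]
    if not answered:
        return 0.0
    correct = sum(responses[q] == gold_answers[q] for q in answered)
    return correct / len(answered)


def aggregate(judgments, trust, min_trust=0.0):
    """Combine judgments on one item into (answer, confidence).

    judgments: list of (contributor id, answer) pairs
    trust:     dict mapping contributor id -> trust score in [0, 1]
    min_trust: assumed cut-off; work from contributors below it is disregarded
    """
    weights = defaultdict(float)
    for contributor, answer in judgments:
        score = trust.get(contributor, 0.0)
        if score >= min_trust:
            weights[answer] += score
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no trusted judgments to aggregate")
    best = max(weights, key=weights.get)
    # Confidence: agreement on the winning answer, weighted by trust.
    return best, weights[best] / total


if __name__ == "__main__":
    trust = {"a": 0.9, "b": 0.8, "c": 0.5}
    judgments = [("a", "cat"), ("b", "cat"), ("c", "dog")]
    print(aggregate(judgments, trust))
    print(aggregate(judgments, trust, min_trust=0.7))
```

With a cut-off applied, the untrusted contributor's judgment no longer lowers the confidence of the aggregate answer, which mirrors the rule that work from contributors who lose trust is disregarded.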