Please use this identifier to cite or link to this item: http://dspace.iiitb.ac.in:8080/handle/123456789/69
Full metadata record
DC Field | Value | Language
dc.contributor.author | Mutalikdesai, Mandar R | -
dc.date.accessioned | 2020-08-19T08:11:31Z | -
dc.date.available | 2020-08-19T08:11:31Z | -
dc.date.issued | 2013-02-02 | -
dc.identifier.other | PH2005902 | -
dc.identifier.uri | http://dspace.iiitb.ac.in:8080/handle/123456789/69 | -
dc.description | viii, 150p. | en_US
dc.description.abstract | An information space is a large collection of content generated by human actors. Examples of information spaces include the Web, digital libraries and social media. These spaces hold significant amounts of latent semantics, which may be relevant to various stakeholders such as market data analysts, marketing research personnel, etc. In this thesis, we address the problem of mining latent semantics from information spaces. We note that information spaces are not uniformly similar in nature. They can be classified into two types: repository spaces and social spaces. Examples of repository spaces are the Web and digital libraries, while examples of social spaces are the blogosphere and wikis. Content is generated in both these spaces by cognitive processes of actors, with the content manifesting as documents, web pages, blog posts, reviews, articles, tweets, etc. During the creation of such content, the actor typically embeds her individual world-views (opinions, feelings, etc.) into the content. Also, in both these spaces, there exist social interactions between the cognitive processes, which manifest as references between documents in the form of links or citations, comments to blog posts, rebuttals to criticisms, responses to reviews, etc. However, repository spaces and social spaces differ in terms of the boundary of social interaction, the scope of engagement of actors, and the localized synchrony of social interactions. While the social interactions in repository spaces can be spread across the entire repository space, the social interactions in social spaces have well-defined boundaries within which the commonly held world-views of multiple actors can emerge in a focused manner. We call this boundary, within which cognitive processes interact, a socio-cognitive process. While the entire repository represents a single socio-cognitive process, there exist multiple socio-cognitive processes within a social space.
Also, in a repository space, the engagement of actors is typically limited to editing only a few documents, since they can edit only those documents that are owned by them. In social spaces, on the other hand, actors can engage in multiple socio-cognitive processes, typically even if those socio-cognitive processes are not owned by them. Another aspect in which repository spaces and social spaces differ is the localized synchrony of social interactions. As we have observed, the social interactions in a repository space are not localized. Also, these interactions between cognitive processes do not exhibit synchrony. By this, we mean that these social interactions are not in tune with each other over time within the socio-cognitive process. On the other hand, the social interactions in a social space are not only localized to a well-defined socio-cognitive process, but also synchronous. Such social interactions largely take place within short durations of each other in a localized manner. In the first part of this thesis, we posit that semantics are the commonly held world-views of actors, which emerge in a socio-cognitive process in both repository spaces and social spaces. We rely on the co-occurrence analysis of artifacts such as concepts (e.g., named entities) and social interactions (e.g., citations) within an information space in order to mine semantics. We assert that co-occurrence analysis embodies a fundamental principle of human cognition known as the Hebbian Theory. We also assert that co-occurrence analysis is a manifestation of the principles of Ordinary Language Philosophy, which states that the meaning of a term depends upon its usage with other terms. Further, we argue that co-occurrence analysis not only helps in identifying the meaning of a concept, but also its semantic associations relative to other concepts. We test this hypothesis in both repositories and social spaces.
In the second part of this thesis, we analyze the co-occurrences of citations (or co-citations) to discover endorsed citations in a repository space. Given a document, an endorsed citation is an outgoing citation whose target document is deemed by other co-citing documents to be more relevant to the source document than the targets of other outgoing citations from the source document. We envisage the use of endorsed citations in focused resource discovery and relevance ranking in repository spaces. In the third part of this thesis, we analyze the co-occurrences of concepts (terms) in a social space to detect object-attribute relationships. Given an object (i.e., a concept), we define as its attributes those concepts that, besides being semantically related to the object, help in collectively describing the object uniquely. We assert that the attributes of an object tend to co-occur with the object across cognitive contexts (paragraphs, article-sections, documents, etc.) in a social space. We present two co-occurrence based hypotheses for identifying object-attribute relationships between concepts. We envisage that the discovery of the semantic attributes of an object has applications in social media analytics, e.g., (i) marketing research personnel looking to find out how the population characterizes their product, and (ii) classification of concepts within an encyclopedic environment like Wikipedia. | en_US
dc.language.iso | en | en_US
dc.publisher | International Institute of Information Technology Bangalore | en_US
dc.subject | Computer Science | en_US
dc.subject | Computer Science Information Systems | en_US
dc.subject | Engineering and Technology | en_US
dc.title | Semantics extraction in information spaces using co-occurrence analysis | en_US
dc.type | Thesis | en_US
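
The co-occurrence analysis described in the abstract can be illustrated with a minimal sketch. It is not the thesis's method: it only counts how many contexts (paragraphs, sections, documents) contain each pair of concepts, then ranks the concepts that most often share a context with a given object. The function names and the toy data are hypothetical, chosen for illustration only.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(contexts):
    """Count, for each unordered pair of concepts, the contexts containing both."""
    counts = Counter()
    for context in contexts:
        # Deduplicate and sort so each pair is counted once per context
        # and always appears in the same (alphabetical) order.
        concepts = sorted(set(context))
        counts.update(combinations(concepts, 2))
    return counts


def top_cooccurring(counts, target, k=3):
    """Rank concepts by the number of contexts they share with `target`."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == target:
            related[b] += n
        elif b == target:
            related[a] += n
    return related.most_common(k)


# Hypothetical toy data: each inner list holds the concepts found in one context.
contexts = [
    ["camera", "lens", "battery"],
    ["camera", "lens", "tripod"],
    ["camera", "battery"],
    ["phone", "battery"],
]
counts = cooccurrence_counts(contexts)
print(top_cooccurring(counts, "camera"))
```

Raw counts such as these are only a starting point. Telling an object's attributes apart from concepts that are merely frequent would need further criteria, which the abstract attributes to two co-occurrence based hypotheses without stating them.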
Appears in Collections: 1. PHD Thesis

Files in This Item:
File | Description | Size | Format
PH2005902-Mandar R Mutalikdesai .pdf | PH2005902-Mandar R Mutalikdesai Thesis | 1.96 MB | Adobe PDF