|Persian title of the article (in translation):||Saliency, attention, and visual search: An information theoretic approach|
|English title of the article:||Saliency, attention, and visual search: An information theoretic approach|
|Related fields:||Computer engineering, software engineering, and programming|
|Format of free articles||The free English articles and Persian translations are in PDF format|
|Translation quality||The translation quality for this article is low|
Excerpt from the Persian translation of the article (rendered in English):
A proposed scheme for saliency computation in the visual domain is put forward, based on the assumption that localized saliency computation serves to maximize the information sampled from one's environment. The model is built entirely on computational constraints, yet it results in an architecture with cells and connectivity reminiscent of what appears in the visual domain. It is shown that a variety of visual search behaviors appear as emergent properties of the model, and therefore of basic principles of coding and information transmission. Experimental results show greater efficacy in predicting fixation patterns across two different data sets compared with competing models.
Excerpt from the English article:
A proposal for saliency computation within the visual cortex is put forth based on the premise that localized saliency computation serves to maximize information sampled from one’s environment. The model is built entirely on computational constraints but nevertheless results in an architecture with cells and connectivity reminiscent of that appearing in the visual cortex. It is demonstrated that a variety of visual search behaviors appear as emergent properties of the model and therefore basic principles of coding and information transmission. Experimental results demonstrate greater efficacy in predicting fixation patterns across two different data sets as compared with competing models.
Humans perform visual search tasks constantly, from finding a set of keys to looking for a friend in a crowded place. However, despite the importance of this task and its ubiquity in our everyday lives, there is no consensus on the neural underpinnings of this behavior. The steep drop-off in visual acuity from the fovea to the periphery necessitates an efficient system for directing the eyes onto those areas of the scene that are relevant to satisfying the goals of an observer. A related and important task is the direction of the focus of attention cortically; that is, the cortical mechanisms underlying the direction of focused processing onto task-relevant visual input. Over the last several decades, a great deal of research effort has been directed toward further understanding the mechanisms that underlie visual sampling, either through observing fixational eye movements or in considering the control of focal cortical processing.
Consideration of fixational eye movements necessarily involves two distinct components: the top-down, task-dependent influence on these behaviors, and the bottom-up, stimulus-driven factors governed by the specific nature of the visual stimulus. The importance of the former is well documented and perhaps most prominently demonstrated by Yarbus (1967). In the experiments of Yarbus, observers were asked a variety of different questions about a specific scene while having their eye movements tracked. The resulting data demonstrate wildly different patterns of eye movements depending on the question posed. More recent efforts have continued in the same vein (Hayhoe & Ballard, 2005; Hayhoe, Shrivastava, Mruczek, & Pelz, 2003; Land, Mennie, & Rusted, 1999), observing eye movements in a variety of real-world settings and further demonstrating the role of task in the direction of visual and presumably cortical sampling.
Certain visual events such as a bright flash of light, a vividly colored sign, or sudden movement will almost certainly result in an observer’s gaze being redirected, independent of any task-related factors. These behaviors reflect the bottom-up, stimulus-driven component of visual sampling behavior. Even in the absence of such remarkable visual patterns, the specific nature of the visual stimulus at hand no doubt factors appreciably into the visual sampling that ensues. A number of studies have attempted to expound on this area by observing correlations between fixations made by human observers and basic features such as edges or local contrast (Parkhurst, Law, & Niebur, 2002; Tatler, Baddeley, & Gilchrist, 2005). The general finding of such studies is that no single basic feature adequately characterizes what comprises salient content across all images. An additional limitation of such an approach is that any result of such a study says little about the underlying neural basis for such computation or the corresponding neural implementation.
An additional domain in which saliency is considered is in the context of attention models that posit the existence of what has been called a saliency map. The introduction of saliency maps came conceptually with Treisman and Gelade’s (1980) Feature Integration Theory in the form of what they describe as a master map of locations. The basic structure of the model is that various basic features are extracted from the scene. Subsequently, the distinct feature representations are merged into a single topographical representation of saliency. In later work, this representation was termed a saliency map and was paired with a selection process that, loosely speaking, selects the largest peak in the representation; the spotlight of attention then moves to the location of this peak (Koch & Ullman, 1985). In this context, the combined pooling of the basic feature maps is referred to as the saliency map.
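The saliency map structure described above (extract feature maps, pool them into one topographic map, select the largest peak) can be illustrated with a minimal Python sketch. It is not the implementation of any of the cited models. The function names, the min-max normalization, the pooling by pointwise averaging, and the toy values are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]


def normalize(fmap: Grid) -> Grid:
    """Rescale one feature map to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in fmap)
    hi = max(max(row) for row in fmap)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in fmap]
    return [[(v - lo) / span for v in row] for row in fmap]


def pool_feature_maps(feature_maps: List[Grid]) -> Grid:
    """Merge feature maps into one topographic map by pointwise averaging.

    Averaging is an assumed pooling rule; the text only says the maps are merged.
    """
    normed = [normalize(f) for f in feature_maps]
    rows, cols = len(normed[0]), len(normed[0][0])
    return [
        [sum(f[r][c] for f in normed) / len(normed) for c in range(cols)]
        for r in range(rows)
    ]


def select_peak(saliency: Grid) -> Tuple[int, int]:
    """Return (row, col) of the largest value; ties go to the first in row-major order."""
    best_rc, best_v = (0, 0), saliency[0][0]
    for r, row in enumerate(saliency):
        for c, v in enumerate(row):
            if v > best_v:
                best_rc, best_v = (r, c), v
    return best_rc


# Toy 3x3 feature maps (made-up values, not data from the paper).
intensity = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
orientation = [[0, 0, 0], [0, 2, 0], [0, 0, 4]]

saliency = pool_feature_maps([intensity, orientation])
attended = select_peak(saliency)  # location the "spotlight" would move to
```

In this toy case the center location is strong in both feature maps, so it wins the selection even though the orientation map alone peaks elsewhere.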
Saliency in this context then refers to the output of an operation that combines some basic set of features into a single representation. Although models based on a saliency map have had some success in predicting fixation patterns and visual search behavior, the definition of saliency captured by these models has one significant methodological shortcoming. The definition of saliency emerges from a definition of local feature contrast that is loosely based on observations concerning interaction among cells locally within primate visual cortex. Although the models succeed at simulating some salience-related behaviors, they offer little in explaining why the operations involved in the model have the structure that is observed and, specifically, what the overall architecture translates into with respect to its relationship to the incoming stimulus in a principled quantitative manner. As such, little is offered in terms of an explanation for design principles behind observed behavior and the structure of the system.
In this paper, we consider the role that the properties of visual stimuli play in sampling from the stimulus-driven perspective. The ambition of this work lies in explaining why certain components implicated in visual saliency computation behave as they do. The paper also presents a novel model for visual saliency computation built on a first-principles information theoretic formulation dubbed Attention based on Information Maximization (AIM). This comprises a principled explanation for behavioral manifestations of AIM. The contributions of this paper include:
1. A computational framework for visual saliency built on first principles. Although AIM is built entirely on computational constraints, the resulting model structure exhibits considerable agreement with the organization of the human visual system.
2. A definition of visual saliency in which there is an implicit definition of context. That is, the proposed definition of visual salience is not based solely on the response of cells within a local region but on the relationship between the response of cells within a local region and cells in the surrounding region. This includes a discussion of the role that context plays in the behavior of related models.
3. Consideration of the impact of principles underlying neural coding on the determination of visual saliency and visual search behavior. This includes a demonstration that a variety of visual search behaviors may be seen as emergent properties of principles underlying neural coding combined with information seeking as a visual sampling strategy.
4. A demonstration that the resulting definition of visual saliency exhibits greater agreement with fixational eye movement data than existing efforts.
As a whole, we establish that an information maximization strategy for saliency-related neural gain control is consistent with the computation observed in the visual cortex. These results are discussed in terms of implications with respect to how attentional selection in general is achieved within the visual cortex.
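To make the idea of information-based, context-dependent saliency concrete, here is a rough Python sketch. The excerpt does not state a formula, so the sketch assumes that "information" means Shannon self-information, -log2 p(x), and that the probability of a local response is estimated from a histogram over its context. The function name, the quantized integer responses, and the choice of the whole grid as context are all illustrative assumptions, not the paper's procedure.

```python
import math
from collections import Counter
from typing import List


def self_information_map(responses: List[List[int]]) -> List[List[float]]:
    """Score each location by -log2 p(value).

    p(value) is estimated as the relative frequency of that value over the
    whole grid, which stands in for the surrounding context (an assumption).
    """
    counts = Counter(v for row in responses for v in row)
    total = sum(counts.values())
    return [[-math.log2(counts[v] / total) for v in row] for row in responses]


# Toy display: one odd item (1) among fifteen identical distractors (0).
responses = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

info = self_information_map(responses)
odd_item_bits = info[2][1]    # rare response, hence high information
distractor_bits = info[0][0]  # common response, hence low information
```

Under these assumptions the same local response can score high or low depending on how common it is in its context, which is the sense in which such a definition of saliency carries an implicit definition of context.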