
2020, Vol. 2 No. 2 Publish Date:2020-4






Visual interaction and its applications

2020, 2(2) : 1-2




Intelligent virtualization of crane lifting using laser scanning technology

2020, 2(2) : 87-103


This paper presents an intelligent path planner for lifting tasks by tower cranes in highly complex environments, such as old industrial plants that were built many decades ago and sites used as temporary storage spaces. Generally, these environments do not have workable digital models, and 3D representations are impractical.
The current investigation introduces the use of cutting-edge laser scanning technology to convert real environments into virtualized versions of the construction sites or plants in the form of point clouds. The challenge is in dealing with the large point cloud datasets from the multiple scans needed to produce a complete virtualized model. The tower crane is also virtualized for the purpose of path planning. A parallelized genetic algorithm is employed to achieve intelligent path planning for the lifting task performed by tower cranes in complicated environments taking advantage of graphics processing unit technology, which has high computing performance yet low cost.
Optimal lifting paths are generated in several seconds.
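As a rough illustration of the genetic-algorithm idea only, the sketch below evolves a list of waypoint heights so that a load clears an obstacle with little vertical travel. It is serial, not GPU-parallel, and the cost function, the `obstacle_height` value, and all function names are assumptions made for this example rather than details from the paper.

```python
import random


def path_cost(heights, obstacle_height=5.0):
    """Total vertical travel plus a penalty for each waypoint below clearance."""
    points = [0.0] + list(heights) + [0.0]  # start and end on the ground
    travel = sum(abs(b - a) for a, b in zip(points, points[1:]))
    penalty = sum(100.0 for h in heights if h < obstacle_height)
    return travel + penalty


def evolve(pop_size=40, genes=4, generations=60, seed=0):
    """Return the lowest-cost chromosome found (hypothetical toy planner)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(0.0, 10.0) for _ in range(genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=path_cost)
        parents = pop[: pop_size // 2]  # elitism: the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genes)  # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(genes)  # mutate one gene, clamped to [0, 10]
            child[i] = min(10.0, max(0.0, child[i] + rng.gauss(0.0, 0.5)))
            children.append(child)
        pop = parents + children
    return min(pop, key=path_cost)
```

Because the best half of each generation is kept, the best cost never gets worse from one generation to the next. In a parallel variant, the per-individual cost evaluations are the natural part to distribute.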
Personalized cardiovascular intervention simulation system

2020, 2(2) : 104-118


This study proposes a series of geometry and physics modeling methods for personalized cardiovascular intervention procedures, which can be applied to a virtual endovascular simulator.
Based on personalized clinical computed tomography angiography (CTA) data, mesh models of the cardiovascular system were constructed semi-automatically. By coupling 4D magnetic resonance imaging (MRI) sequences corresponding to a complete cardiac cycle with related physics models, a hybrid kinetic model of the cardiovascular system was built to drive kinematics and dynamics simulation. On that basis, the surgical procedures related to intervention instruments were simulated using specially-designed physics models. These models can be solved in real-time; therefore, the complex interactions between blood vessels and instruments can be well simulated. Additionally, X-ray imaging simulation algorithms and realistic rendering algorithms for virtual intervention scenes are also proposed. In particular, instrument tracking hardware with haptic feedback was developed to serve as the interaction interface of real instruments and the virtual intervention system. Finally, a personalized cardiovascular intervention simulation system was developed by integrating the techniques mentioned above.
This system supported instant modeling and simulation of personalized clinical data and significantly improved the visual and haptic immersions of vascular intervention simulation.
It can be used in teaching basic cardiology and effectively satisfying the demands of intervention training, personalized intervention planning, and rehearsing.
End-to-end spatial transform face detection and recognition

2020, 2(2) : 119-131


Several face detection and recognition methods with excellent performance have been proposed in the past decades. The conventional face recognition pipeline comprises four stages that are independent of each other: (1) face detection, (2) face alignment, (3) feature extraction, and (4) similarity computation. Keeping these stages separate leads to redundant model computation and makes end-to-end training difficult.
In this paper, we propose a novel end-to-end trainable convolutional network framework for face detection and recognition, in which a geometric transformation matrix is directly learned to align the faces rather than predicting the facial landmarks. In the training stage, our single CNN model is supervised only by face bounding boxes and personal identities, which are publicly available from the WIDER FACE and CASIA-WebFace datasets. Our model is tested on the Face Detection Dataset and Benchmark (FDDB) and Labeled Faces in the Wild (LFW) datasets.
The results show 89.24% recall for face detection tasks and 98.63% accuracy for face recognition tasks.
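To make the idea of aligning faces with a transformation matrix concrete, here is a minimal sketch of how a 2x3 affine matrix can map each pixel of an aligned output patch back to a source coordinate, in the style of a spatial transformer sampling grid. The function name `affine_grid` and the normalized [-1, 1] coordinate convention are assumptions for this example; the abstract does not specify the paper's parameterization.

```python
def affine_grid(theta, height, width):
    """Map each output pixel to a source coordinate in normalized [-1, 1] space.

    theta is a 2x3 affine matrix [[a, b, tx], [c, d, ty]].
    """
    if height < 2 or width < 2:
        raise ValueError("height and width must be at least 2")
    grid = []
    for i in range(height):
        y = -1.0 + 2.0 * i / (height - 1)  # normalized row coordinate
        row = []
        for j in range(width):
            x = -1.0 + 2.0 * j / (width - 1)  # normalized column coordinate
            src_x = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            src_y = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((src_x, src_y))
        grid.append(row)
    return grid
```

With the identity matrix the grid covers the whole source image. Scaling the diagonal entries by 0.5 samples only the central half, which corresponds to zooming in on a face region.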
Two-phase real-time rendering method for realistic water refraction

2020, 2(2) : 132-141


Realistic rendering has been an important goal of several interactive applications, which requires an efficient virtual simulation of many special effects that are common in the real world. However, refraction is often ignored in these applications. Rendering the refraction effect is extremely complicated and time-consuming.
In this study, a simple, efficient, and fast rendering technique for water refraction effects is proposed. This technique comprises a broad phase and a narrow phase. In the broad phase, the water surface is considered flat. The vertices of underwater meshes are transformed based on Snell’s Law. In the narrow phase, the effects of waves on the water surface are examined. Every pixel on the water surface mesh is collected by a screen-space method with an extra rendering pass. The broad phase redirects most pixels that need to be recalculated in the narrow phase to the pixels in the rendering buffer.
We compared the performance of three conventional methods and ours in rendering refraction effects for the same scenes. The proposed method achieves a higher frame rate and better physical accuracy than the other methods. It has been used in several game scenes, where realistic water refraction effects can be generated efficiently.
The two-phase water refraction method offers a trade-off between efficiency and quality. It is easy to implement in modern game engines and can thus improve the rendering quality of scenes in video games and other real-time applications.
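For reference, Snell's law states that n1 sin(theta1) = n2 sin(theta2). The sketch below applies its standard vector form to a single ray. It illustrates only the law itself, not the paper's vertex transformation, and the function name `refract` and the example refractive indices are assumptions.

```python
import math


def refract(incident, normal, n1, n2):
    """Refract a unit direction at a surface using the vector form of Snell's law.

    `normal` is a unit vector pointing toward the side the ray comes from.
    Returns the refracted unit direction, or None on total internal reflection.
    """
    eta = n1 / n2
    cos_i = -sum(i * n for i, n in zip(incident, normal))
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        return None  # no transmitted ray exists
    cos_t = math.sqrt(1.0 - sin2_t)
    return tuple(eta * i + (eta * cos_i - cos_t) * n
                 for i, n in zip(incident, normal))
```

A ray entering water (n2 of about 1.33) from air at 45 degrees bends toward the normal, whereas a shallow ray leaving water for air can undergo total internal reflection.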
Temporal continuity of visual attention for future gaze prediction in immersive virtual reality

2020, 2(2) : 142-152


Eye tracking technology is receiving increased attention in the field of virtual reality. Specifically, future gaze prediction is crucial in pre-computation for many applications such as gaze-contingent rendering, advertisement placement, and content-based design. To explore future gaze prediction, it is necessary to analyze the temporal continuity of visual attention in immersive virtual reality.
In this paper, the concept of temporal continuity of visual attention is presented. Subsequently, an autocorrelation function method is proposed to evaluate the temporal continuity. Thereafter, the temporal continuity is analyzed in both free-viewing and task-oriented conditions.
Specifically, in free-viewing conditions, the analysis of a free-viewing gaze dataset indicates that the temporal continuity performs well only within a short time interval. A task-oriented game scene condition was created and used to collect users’ gaze data. An analysis of the collected gaze data shows that the temporal continuity performs similarly to that under free-viewing conditions. Temporal continuity can be applied to future gaze prediction: when it is good, users’ current gaze positions can be directly utilized to predict their gaze positions in the future.
The future-prediction performance of the current gaze is further evaluated in both free-viewing and task-oriented conditions, and the results show that the current gaze can be efficiently applied to short-term future gaze prediction. Long-term gaze prediction remains to be explored.
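One common way to quantify temporal continuity is the normalized sample autocorrelation of a gaze coordinate at a given lag. The sketch below uses the standard estimator for a one-dimensional series. The paper's exact formulation is not given in the abstract, so the function name and the handling of constant series are assumptions.

```python
def autocorrelation(series, lag):
    """Normalized sample autocorrelation of a 1-D series at a non-negative lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    variance_sum = sum(d * d for d in deviations)
    if variance_sum == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Sum of products of deviations that are `lag` samples apart
    covariance_sum = sum(deviations[t] * deviations[t + lag] for t in range(n - lag))
    return covariance_sum / variance_sum
```

Values near 1 at a given lag suggest that the current gaze position is a reasonable predictor of the position that many samples later, and values that decay quickly with the lag indicate continuity only over short intervals.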
On attaining user-friendly hand gesture interfaces to control existing GUIs

2020, 2(2) : 153-161


Hand gesture interfaces are dedicated programs that principally perform hand tracking and hand gesture prediction to provide alternative controls and interaction methods. They take advantage of one of the most natural ways of interaction and communication, proposing novel input and showing great potential in the field of human-computer interaction. Developing a flexible and rich hand gesture interface is known to be a time-consuming and arduous task. Previously published studies have demonstrated the significance of the finite-state-machine (FSM) approach when mapping detected gestures to GUI actions.
In our hand gesture interface, we broadened the FSM approach by utilizing gesture-specific attributes, such as distance between hands, distance from the camera, and time of occurrences, to enable users to perform unique GUI actions. These attributes are obtained from hand gestures detected by the RealSense SDK employed in our hand gesture interface. By means of these gesture-specific attributes, users can activate static gestures and perform them as dynamic gestures. We also provided supplementary features to enhance the efficiency, convenience, and user-friendliness of our hand gesture interface. Moreover, we developed a complementary application for recording hand gestures by capturing hand keypoints in depth and color images to facilitate the generation of hand gesture datasets.
We conducted a small-scale user survey with fifteen subjects to test and evaluate our hand gesture interface. Anonymous feedback obtained from the users indicates that our hand gesture interface is sufficiently easy and self-explanatory to use. In addition, we received constructive feedback about minor flaws regarding the responsiveness of the interface.
We proposed a hand gesture interface along with key concepts to attain user-friendliness and effectiveness in the control of existing GUIs.
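The FSM approach with gesture-specific attributes can be pictured with a small sketch in which a static gesture triggers a GUI action only when an attribute threshold is met. All gesture labels, state names, action names, and the 0.5-second hold threshold below are hypothetical and are not taken from the paper or from the RealSense SDK.

```python
from dataclasses import dataclass


@dataclass
class GestureEvent:
    name: str         # hypothetical gesture label, e.g. "fist" or "open_palm"
    hold_time: float  # seconds the gesture has been held


class GestureFSM:
    """Toy finite-state machine mapping gesture events to GUI actions."""

    def __init__(self):
        self.state = "idle"

    def step(self, event):
        """Consume one event and return a GUI action name, or None."""
        if self.state == "idle":
            # Attribute guard: a fist must be held long enough to start a drag
            if event.name == "fist" and event.hold_time >= 0.5:
                self.state = "dragging"
                return "mouse_down"
        elif self.state == "dragging":
            if event.name == "fist":
                return "mouse_move"
            if event.name == "open_palm":
                self.state = "idle"
                return "mouse_up"
        return None
```

Guarding transitions with attributes such as hold time lets the same static gesture act as a dynamic one, for example a held fist that becomes a drag.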
COMTIS: Customizable touchless interaction system for large screen visualization

2020, 2(2) : 162-174


Large screen visualization systems have been widely utilized in many industries. Such systems can help illustrate the working states of different production systems. However, efficient interaction with such systems is still a focus of related research.
In this paper, we propose a touchless interaction system based on an RGB-D camera using a novel bone-length constraining method. The proposed method optimizes the joint data collected from RGB-D cameras, yielding more accurate and more stable results on very noisy data. The user can customize the system by modifying the finite-state machine in the system and reuse the gestures in multiple scenarios, reducing the number of gestures that need to be designed and memorized.
We tested the system in two cases. In the first case, we illustrated the process of improving the gesture designs on our system and tested the system through a user study. In the second case, we utilized the system in the mining industry and conducted a user study, in which users reported that the system is easy to use.
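The abstract does not describe how the bone-length constraint is formulated. As a simple stand-in, the sketch below keeps the measured bone direction and rescales a noisy child joint so that it lies at a known reference distance from its parent. The function name and the reference-length input are assumptions.

```python
import math


def constrain_bone(parent, child, length):
    """Move `child` along the parent-to-child direction to the given bone length."""
    direction = [c - p for p, c in zip(parent, child)]
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        return tuple(child)  # direction undefined; leave the joint unchanged
    scale = length / norm
    return tuple(p + d * scale for p, d in zip(parent, direction))
```

Applying such a projection from the root of the skeleton outward keeps every bone at a fixed length, which suppresses depth noise that would otherwise stretch or shrink the limbs between frames.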