I am a Ph.D. candidate in the Computer Vision Lab (CVLab) at Stony Brook University. My advisor is Prof. Dimitris Samaras. Before that, I received my Master's and Bachelor's degrees from the School of Electronic and Information Engineering at South China University of Technology (SCUT).

My research interests span computer vision and graphics, including texture analysis, 3D estimation, inverse rendering, and image-based rendering.

Curriculum Vitae

Email: kemma[AT]cs.stonybrook.edu

Address: Stony Brook University, Stony Brook, NY, 11790

Publications

A Joint Spatial and Magnification Based Attention Framework for Large Scale Histopathology Classification
Jingwei Zhang, Ke Ma, John Van Arnam, Rajarsi Gupta, Joel Saltz, Maria Vakalopoulou, and Dimitris Samaras
CVPR Workshops 2021 (CVMI best paper award)

Conventional deep learning methods cannot handle enormous image sizes; instead, they split the image into patches which are exhaustively processed, usually through multi-instance learning approaches. Moreover, especially in histopathology, determining the most appropriate magnification at which to generate these patches is also exhaustive: a model needs to traverse all possible magnifications to select the optimal one. We propose a novel spatial and magnification based attention sampling strategy that uses a small image thumbnail to determine the spatial location and magnification of informative regions, and classifies the original gigapixel image based on a few patches sampled from these regions. Our experiments on BACH and a subset of the TCGA-PRAD dataset demonstrate that the proposed method runs 2.5 times faster while still maintaining comparable accuracy.

Paper

Intrinsic Decomposition of Document Images In-the-Wild
Sagnik Das, Hassan Ahmed Sial, Ke Ma, Ramon Baldrich, Maria Vanrell, and Dimitris Samaras
BMVC 2020

The performance of automatic document content processing is often affected by artifacts caused by the shape of the paper and by non-uniform and diverse lighting conditions. In this paper, we propose a learning-based architecture that directly estimates document reflectance based on intrinsic image formation and generalizes to challenging illumination conditions. We also create a new dataset, Doc3dShade, which improves on previous synthetic ones by adding a large range of realistic and diverse multi-illuminant conditions.

Paper Project

DewarpNet: Single-Image Document Unwarping with Stacked 3D and 2D Regression Networks
Sagnik Das*, Ke Ma*, Zhixin Shu, Dimitris Samaras, and Roy Shilkrot
ICCV 2019

In this work, we propose DewarpNet, a deep-learning approach for document image unwarping from a single image. Our insight is that the 3D geometry of the document not only determines the warping of its texture but also causes the illumination effects. Therefore, our novelty resides in the explicit modeling of 3D shape for document paper in an end-to-end pipeline. Also, we contribute the largest and most comprehensive dataset for document image unwarping to date – Doc3D. This dataset features multiple ground-truth annotations, including 3D shape, surface normals, UV map, albedo image, etc. Training with Doc3D, we demonstrate state-of-the-art performance for DewarpNet with extensive qualitative and quantitative evaluations.

Paper Project

DocUNet: Document Image Unwarping via A Stacked U-Net
Ke Ma, Zhixin Shu, Xue Bai, Jue Wang, and Dimitris Samaras
CVPR 2018

Capturing document images is a common way to digitize and record physical documents due to the ubiquity of mobile cameras. To make text recognition easier, it is often desirable to digitally flatten a document image when the physical document sheet is folded or curved. In this paper, we develop the first learning-based method to achieve this goal. We propose a stacked U-Net with intermediate supervision to directly predict the forward mapping from a distorted image to its rectified version. We create a synthetic dataset with approximately 100 thousand images by warping non-distorted document images. We further create a comprehensive benchmark that covers various real-world conditions. We evaluate the proposed model quantitatively and qualitatively on the proposed benchmark, and compare it with previous non-learning-based methods.

Paper Project

Large-Scale Continual Road Inspection: Visual Infrastructure Assessment in the Wild
Ke Ma, Minh Hoai, and Dimitris Samaras
BMVC 2017

This work develops a method to inspect the quality of pavement conditions based on images captured from moving vehicles. Our first contribution in this paper is the development of a method to create a large-scale dataset of pavement images. Specifically, using map and GPS information, we match the ratings by government inspectors found in public databases to Google Street View images, creating a dataset containing more than 700K images from 70K street segments. We use the dataset to develop a deep-learning method for road assessment, which is based on Convolutional Neural Networks, Fisher Vector encoding, and UnderBagging random forests.

Paper Project

Texture classification for rail surface condition evaluation
Ke Ma, Tomás F. Yago Vicente, Dimitris Samaras, Michael Petrucci, and Daniel L. Magnus
WACV 2016

Rail surface defects threaten train and passenger safety. Hence, rail surfaces must be restored using different processes depending on the measured severity of the defects. In this paper, we propose a new method for automatic classification of rail surface defect severity from images collected by rail inspection vehicles. It contains two components: a rail surface segmentation module, which utilizes structured random forests to generate an edge map and a Generalized Hough Transform to locate the boundaries of the rail surface; and a defect severity classification module, which combines multiple SVM classifiers through a stacked ensemble model. Our experiments on a dataset of 939 images categorized into 8 severity levels achieved 82% accuracy.

Paper

The structure–mechanical relationship of palm vascular tissue
Ningling Wang, Wangyu Liu, Jiale Huang, and Ke Ma
Journal of the Mechanical Behavior of Biomedical Materials 2014

To study the structure–mechanical relationship of palm sheath, the cellular structure of the vascular tissue is rebuilt with an image-based reconstruction method and used to create finite element models. The validity of the models is first verified against the results of tensile tests. Then, the cell walls inside each of the specific regions (fiber cap, vessel, xylem, etc.) are randomly removed to obtain virtually imperfect structures. By comparing the magnitudes of performance degradation in the different imperfect structures, the influence of each region on the overall mechanical performance of the vascular tissue is discussed.


Other Projects

Medraw

Medraw is a web-based interface that helps pathologists annotate regions on histopathology images. The front end is built on Bootstrap and uses a canvas tag to overlay a transparent painting layer on the image. The interface is friendly to mobile and touch devices, so pathologists can annotate images by swiping on tablets. The backend is a simple PHP server that can load a previous annotation or save the current one. The interface is for internal use only. A snapshot is shown below.

Snapshot

Design from God of War. Drawn by u/Yfelody

Mimir

Mimir is built to make browsing each day's arXiv papers easier. The backend server runs a scheduled task that downloads all the arXiv papers tagged as computer vision every day. All the PDFs are converted into images on the server, as images are easier than PDFs for end devices to load. On the client side, papers are organized as a list of cards. Each card shows the title, author list, and some meta-information, such as whether the paper has been published. A card contains a clickable gallery of images showing the content of the paper. This makes it convenient to review each day's new papers on mobile devices in a Twitter-reading style.

Mimir

Misc

I am a big fan of video games. My favorite game is Okami from Clover Studio at Capcom. The game was originally released on PS2 and later remastered on other platforms. I beat the game once on PS2 and once on PS4 with the platinum trophy. I also like the Final Fantasy series. I played FF 6, 7, 8, 9, 10, 12, 13, and 15, but only beat 7 Crisis Core, 7 Dirge of Cerberus, and 15 (platinum trophy). Other games I have beaten include Kingdom Hearts 1, 2, 2.8, 3, Uncharted 1, 2, 3, 4, the Ratchet & Clank series, the Rockman (X) series, and so on. Recently I have shifted more of my gaming time to the Nintendo Switch. I just beat Zelda: Breath of the Wild, Link's Awakening, and Octopath Traveler.

I also love anime and manga. My favorite character is Saitama from One Punch Man by the manga artists ONE and Yusuke Murata.

My other hobbies include painting and drawing, origami, and badminton.