Dr. Philipp Harzig
Phone: N/A
Email: philipp.harzig@informatik.uni-augsburg.de
Research Interests
- Computer Vision
- Machine Learning
- Visual Question Answering
- Image Captioning
- Deep Convolutional Neural Networks
- Recurrent Neural Networks
- Machine Learning Optimization
Projects
- Automatic Image Captioning of Scenes Depicting Branded Products
- Company Logo Detection
PhD Thesis
Philipp Harzig. Automatic Generation of Natural Language Descriptions of Visual Data: Describing Images and Videos using Recurrent and Self-Attentive Models
Dissertation, University of Augsburg, February 04, 2022.
Publications
2022
Philipp Harzig. 2022. Automatic generation of natural language descriptions of visual data: describing images and videos using recurrent and self-attentive models. Dissertation, Universität Augsburg.
Katja Ludwig, Philipp Harzig and Rainer Lienhart. 2022. Detecting arbitrary intermediate keypoints for human pose estimation with vision transformers. In Winter Conference on Applications of Computer Vision (WACV), Waikoloa, Hawaii, USA, January 4-8, 2022. IEEE, Piscataway, NJ, 663-671. DOI: 10.1109/WACVW54805.2022.00073
Philipp Harzig, Moritz Einfalt and Rainer Lienhart. 2022. Synchronized audio-visual frames with fractional positional encoding for transformers in video-to-text translation. In Yannick Berthoumieu, Pascal Frossard, Giuseppe Valenzise, Thomas Maugey (Eds.). 2022 IEEE International Conference on Image Processing (ICIP), 16-19 October 2022, Bordeaux, France. IEEE, Piscataway, NJ, 2041-2045. DOI: 10.1109/ICIP46576.2022.9897804
2021
Debesh Jha, Sharib Ali, Steven Hicks, Vajira Thambawita, Hanna Borgli, Pia H. Smedsrud, Thomas de Lange, Konstantin Pogorelov, Xiaowei Wang, Philipp Harzig, Minh-Triet Tran, Wenhua Meng, Trung-Hieu Hoang, Danielle Dias, Tobey H. Ko, Taruna Agrawal, Olga Ostroukhova, Zeshan Khan, Muhammad Atif Tahir, Yang Liu, Yuan Chang, Mathias Kirkerød, Dag Johansen, Mathias Lux, Håvard D. Johansen, Michael A. Riegler and Pål Halvorsen. 2021. A comprehensive analysis of classification methods in gastrointestinal endoscopy imaging. Medical Image Analysis 70, 102007. DOI: 10.1016/j.media.2021.102007
2020
Stephan Brehm, Philipp Harzig, Moritz Einfalt and Rainer Lienhart. 2020. Learning segmentation from object color. In Jia Li and Shuyuan Zhu (Eds.). 2020 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), 6-8 August 2020, Shenzhen, Guangdong, China. IEEE, Piscataway, NJ, 139-144. DOI: 10.1109/mipr49039.2020.00036
Philipp Harzig, Moritz Einfalt, Katja Ludwig and Rainer Lienhart. 2020. Transforming Videos to Text (VTT Task) Team: MMCUniAugsburg. In TREC Video Retrieval Evaluation (TRECVID) 2020, virtual, 8-11 December 2020.
2019
Philipp Harzig, Yan-Ying Chen, Francine Chen and Rainer Lienhart. 2019. Addressing data bias problems for chest x-ray image report generation. In 30th British Machine Vision Conference, 9-12 September 2019, Cardiff, UK.
Philipp Harzig, Moritz Einfalt and Rainer Lienhart. 2019. Automatic disease detection and report generation for gastrointestinal tract examinations. In Laurent Amsaleg, Benoit Huet, Martha Larson, Guillaume Gravier, Hayley Hung, Chong-Wah Ngo and Wei Tsang Ooi (Eds.). MM '19: The 27th ACM International Conference on Multimedia, Nice, France, October 2019. ACM Press, New York, NY, 2573-2577. DOI: 10.1145/3343031.3356066
Philipp Harzig, Dan Zecha, Rainer Lienhart, Carolin Kaiser and René Schallner. 2019. Image captioning with clause-focused metrics in a multi-modal setting for marketing. In IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), 28-30 March 2019, San Jose, CA, USA. IEEE, Piscataway, NJ, 419-424. DOI: 10.1109/MIPR.2019.00085
2018
Philipp Harzig, Stephan Brehm, Rainer Lienhart, Carolin Kaiser and Rene Schallner. 2018. Multimodal image captioning for marketing analysis. In IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), 10-12 April 2018, Miami, FL, USA. IEEE, Piscataway, NJ, 158-161. DOI: 10.1109/mipr.2018.00035
Philipp Harzig, Christian Eggert and Rainer Lienhart. 2018. Visual question answering with a hybrid convolution recurrent model. In Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval - ICMR '18, June 11-14, 2018, Yokohama, Japan. ACM Press, New York, USA, 318-325. DOI: 10.1145/3206025.3206054
2016
Philipp Harzig. 2016. Implementation of frequency domain convolution for the caffe-framework. Master's thesis, Universität Augsburg, Augsburg.
Master Thesis
Philipp Harzig.
Implementation of Frequency Domain Convolution for the Caffe-Framework.
Master Thesis, February 2016.
Teaching
- WS 2019/2020: Advanced Deep Learning [Digicampus]
- WS 2019/2020: Grundlagen der Signalverarbeitung und des Maschinellen Lernens (Multimedia Grundlagen I) [Digicampus]
- WS 2019/2020: Grundlagen der Signalverarbeitung und des Maschinellen Lernens für Medizininformatiker [Digicampus]
- SS 2019: Advanced Deep Learning [Digicampus]
- WS 2018/2019: Advanced Deep Learning [Digicampus]
- SS 2018: Praktikum über Autonomes Fahren [Digicampus]
- SS 2018: Multimedia II: Machine Learning & Computer Vision
- WS 2017/2018: Multimedia Grundlagen I
- SS 2017: Praktikum über Autonomes Fahren [Digicampus]
- SS 2017: Multimedia II: Machine Learning & Computer Vision
- WS 2016/2017: Multimedia Grundlagen I
- SS 2016: Multimedia II: Machine Learning & Computer Vision