Progressive Fault Tolerance in API-First Front-End Applications: Architecture, Patterns, and Evaluation

Authors

  • Althaf Khan Pattan, Sr. Engineer, Comcast, Exton, Pennsylvania, USA
  • Somraju Gangishetti, Engineering Manager, Forbes Media LLC, Delaware, USA

DOI:

https://doi.org/10.63282/3050-9262.IJAIDSML-V5I1P126

Keywords:

Fault Tolerance, Front-End Architecture, API Resilience, Graceful Degradation, Circuit Breaker, Client-Side State, Progressive Enhancement, Web Application Reliability

Abstract

Modern web applications depend on multiple backend API services to render content, validate user actions, and persist state. When any of these APIs becomes slow, intermittently failing, or fully unavailable, the front-end experience typically collapses into error screens and lost user progress. This paper presents a progressive fault tolerance framework for API-first front-end architectures. The framework classifies every API dependency into one of three tiers (hard, soft, or deferrable), applies tier-appropriate resilience mechanisms (retry budgets, circuit breakers, and stale-while-revalidate caching), and maintains a client-side state journal that protects user work across outages. A formal degradation state machine governs transitions among healthy, degraded, offline, and recovering modes, while UI adaptation rules ensure that the interface communicates system status without alarming users. Simulated failure-injection experiments across four outage scenarios show that, compared with a baseline application with no resilience layer, the framework raises task completion rates from 41% to 87%, reduces session abandonment from 58% to 14%, and cuts mean recovery time from 18.2 seconds to 2.8 seconds.
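
To make the tier and mode vocabulary concrete, the following TypeScript sketch shows one way a degradation state machine and a circuit breaker could fit together. It is illustrative only: the abstract does not specify the transition rules, so the rule used here (a failed hard dependency forces offline, any other failure forces degraded, and a return to full health passes through recovering) is an assumption, as are the names Dependency, nextMode, and CircuitBreaker and the threshold and cooldown parameters.

```typescript
// Hypothetical sketch; names and transition rules are assumptions, not the paper's design.
type Tier = "hard" | "soft" | "deferrable";
type Mode = "healthy" | "degraded" | "offline" | "recovering";

interface Dependency {
  name: string;
  tier: Tier;
  up: boolean; // latest health observation for this API
}

// Assumed rule: a failed hard dependency means "offline"; a failed soft or
// deferrable dependency means "degraded"; once everything is up again, the
// application passes through "recovering" before returning to "healthy".
function nextMode(current: Mode, deps: Dependency[]): Mode {
  const down = deps.filter((d) => !d.up);
  if (down.some((d) => d.tier === "hard")) return "offline";
  if (down.length > 0) return "degraded";
  if (current === "offline" || current === "degraded") return "recovering";
  return "healthy";
}

// Minimal circuit breaker: opens after `threshold` consecutive failures and
// allows a probe request again once `cooldownMs` has elapsed (half-open).
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(threshold: number, cooldownMs: number, now: () => number = () => Date.now()) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock, so the breaker can be tested without waiting
  }

  canRequest(): boolean {
    if (this.openedAt === null) return true; // closed: requests flow normally
    return this.now() - this.openedAt >= this.cooldownMs; // open until cooldown ends
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null; // a successful call closes the circuit
  }

  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = this.now(); // open or re-open
  }
}
```

In this sketch a caller would keep one breaker per API dependency, consult canRequest() before each call, and feed the resulting health observations into nextMode to decide which UI adaptation to show. Retry budgets, stale-while-revalidate caching, and the state journal are left out to keep the example short.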

Published

2024-03-30

Section

Articles

How to Cite

Pattan AK, Gangishetti S. Progressive Fault Tolerance in API-First Front-End Applications: Architecture, Patterns, and Evaluation. IJAIDSML [Internet]. 2024 Mar. 30 [cited 2026 May 30];5(1):253-61. Available from: https://ijaidsml.org/index.php/ijaidsml/article/view/568