First things first! The article below was NOT written by me, and I have no relationship to it (other than sharing the same pain). However, it totally falls in line with what our customers and I are observing, and if they don't see it as a problem now, they will see it as a problem in a couple of years.
What is the problem? Using ACLs to segment your traffic in a data center is painful, for many reasons. You write policy in a language that is great for networkers, but you need to get the information for writing it from the application guys. You write policy on devices that can do L3/L4 policy, but they were really built to handle L2 traffic. Make one mistake in that policy and your switch, router or firewall won't explode (not literally, anyway), but it might open up your data center to traffic that you never wanted to allow.
Another obvious issue the article mentions is responsibility. Who on earth is responsible for rules between workloads, between applications? The author states that he never started a job where the application team owned the firewall rules for the application, and I have never seen it in organisations big or small either. There is a huge disconnect between the application teams, the network teams and the security teams, and not only on how to write the policy for an application (Application team: please open anything so this works. Security team: please close almost anything so it's secure. Network team: not going to happen on our infrastructure, it could break). There is also no common knowledge of the connections: of course there is no real-time Application Dependency Map that shows all the traffic and builds a common understanding between the three teams.
The third problem the author states is well known too: firewall rulesets and ACLs grow. Daily, weekly, monthly, yearly. After three years (it doesn't really take that long) they outgrow the TCAM space or become unreadable. The firewall vendor says you should buy a bigger firewall. Nobody will ever delete a rule, because it's not clear what the impact would be. If it's not a pure allow-list approach, order will matter and no sane person will ever touch rule order again. Ever. You make a change, and the ruleset can break. Angry people will call you. You could be fired. For a firewall or ACL change. So what do you do? You have a family to feed or a house to pay off, so you avoid any risk.
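To make the point about rule order concrete, here is a minimal sketch of first-match evaluation, the way a classic ACL is typically processed. The `Rule` type, the `first_match` function and all addresses and ports are made up for illustration and are not taken from the article or from any vendor's syntax.

```python
from ipaddress import ip_address, ip_network
from typing import NamedTuple


class Rule(NamedTuple):
    action: str  # "allow" or "deny"
    src: str     # source network in CIDR notation
    port: int    # destination port


def first_match(rules, src_ip, port, default="deny"):
    """Return the action of the first matching rule, or the default."""
    for rule in rules:
        if ip_address(src_ip) in ip_network(rule.src) and port == rule.port:
            return rule.action
    return default


# Hypothetical mixed ruleset: one host is blocked, its subnet is allowed.
deny_host = Rule("deny", "10.0.0.5/32", 443)
allow_net = Rule("allow", "10.0.0.0/24", 443)

# Same rules, different order, different verdict for 10.0.0.5.
print(first_match([deny_host, allow_net], "10.0.0.5", 443))  # deny
print(first_match([allow_net, deny_host], "10.0.0.5", 443))  # allow

# Hypothetical pure allow-list: the order of the rules is irrelevant.
allow_a = Rule("allow", "10.0.0.0/24", 443)
allow_b = Rule("allow", "10.0.1.0/24", 22)
print(first_match([allow_a, allow_b], "10.0.1.7", 22))  # allow
print(first_match([allow_b, allow_a], "10.0.1.7", 22))  # allow
```

With only allow rules and an implicit deny at the end, a flow is permitted if any rule matches it, so reordering cannot change the result. As soon as deny rules are mixed in, the position of every rule becomes part of the policy.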
I couldn't agree more, and the author gives some great advice to people maintaining ACLs and firewall policy: name things, create groups, and have a naming and labelling structure.
He also suggests getting the application owners on board and outsourcing policy to them, because they know (or are supposed to know) their applications.
Deny by default, permit by exception. Call it zero trust, or just call it allow-list or call it "this will make my life easier in the long term". Don't go for deny-lists. Ever.
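Here is a rough sketch of how labels plus an allow-list could fit together. Everything in it is an assumption for illustration: the `Workload` and `Allow` types, the `app` and `role` labels, the "shop" application and its ports are hypothetical and do not come from the article or from any product.

```python
from typing import NamedTuple


class Workload(NamedTuple):
    name: str
    app: str   # hypothetical label: the application this workload belongs to
    role: str  # hypothetical label: the tier inside that application


class Allow(NamedTuple):
    src_role: str
    dst_role: str
    port: int


# Allow-list for a hypothetical "shop" application, written in terms the
# application owners understand (roles), not IP addresses.
SHOP_POLICY = {
    Allow("web", "app", 8080),
    Allow("app", "db", 5432),
}


def is_allowed(policy, src, dst, port):
    """Deny by default: permit only explicitly listed flows inside one app."""
    if src.app != dst.app:
        return False
    return Allow(src.role, dst.role, port) in policy


web = Workload("shop-web-01", "shop", "web")
app = Workload("shop-app-01", "shop", "app")
db = Workload("shop-db-01", "shop", "db")

print(is_allowed(SHOP_POLICY, web, app, 8080))  # True: explicitly permitted
print(is_allowed(SHOP_POLICY, web, db, 5432))   # False: never listed
```

Because the policy is a set of exceptions with no ordering, adding or removing one entry only affects the flow it describes, which is what makes handing it over to application owners thinkable.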
What that thing DOES is pretty clear. You can issue concise commands and get direct information about the operation of that device. Add policy enforcement to the mix and whoa there Bessy, you done fucked up. Firewall configurations within the data center can (and will) quickly grow to hundreds of thousands of lines.