What is the purpose of 'values' for networks in the definition files #236

Open
csiebert007 opened this issue Feb 3, 2023 · 13 comments

Comments

@csiebert007

Hi!

IMHO it would make reading and editing the definition files much easier if the intermediate key 'values' would be omitted with the networks definition - see suggested example below.
Or is there any reason, e.g. another key that can be used instead of or in parallel to 'values' for networks?
So far I could not find any alternative key in the documentation.
If there is no other such key, I'd omit this 'values' key and use the same structure for networks as it is for services.

Many thanks in advance!
Ciao,
Christoph

Suggested definition YAML for networks (as already for services):

networks:
  RFC1918:
    - address: 10.0.0.0/8
    - address: 172.16.0.0/12
    - address: 192.168.0.0/16
  WEB_SERVERS:
    - address: 10.0.0.1/32
      comment: Web Server 1
    - address: 10.0.0.2/32
      comment: Web Server 2
  MAIL_SERVERS:
    - address: 10.0.0.3/32
      comment: Mail Server 1
    - address: 10.0.0.4/32
      comment: Mail Server 2
  ALL_SERVERS:
    - WEB_SERVERS
    - MAIL_SERVERS

services:
  HTTP:
    - protocol: tcp
      port: 80
  HTTPS:
    - protocol: tcp
      port: 443
  WEB:
    - HTTP
    - HTTPS
  HIGH_PORTS:
    - port: 1024-65535
      protocol: tcp
    - port: 1024-65535
      protocol: udp

@fischa
Collaborator

fischa commented Feb 3, 2023

Hi @csiebert007,
if I recall correctly there are plans (#1) for adding other options on how to "fill" definitions, e.g. via a lookup in an IPAM. I think that's where the 'values' key comes from.

Regards,
Axel

@ankenyr
Collaborator

ankenyr commented Feb 4, 2023

Going to mark this closed but feel free to continue the discussion.

@ankenyr ankenyr closed this as completed Feb 4, 2023
@csiebert007
Author

Hi!

I still do not understand how this key 'values' in the network definition will help if there is a lookup to an IPAM. At some point the definitions from IPAM need to be merged with the definitions in the YAML file. Why would that be on a different level?

Also why is not something like 'values' foreseen for the services definitions? E.g. to import /etc/services or from a management system via API?

Best regards,
Christoph

@ankenyr
Collaborator

ankenyr commented Feb 5, 2023

/etc/services I would not consider to be a source of truth for the network as a whole. Do any of the IPAM solutions such as nautobot or bluecats provide port information?

The ability to integrate with IPAMs via plugins was something we wanted to do before launching but we decided to cut it for launch.

Personally the way I have always used the project and the way I see the IPAM integration going is to have a separate process (or subcommand when we consolidate things into one command) that can either update one definition or update the entire file which is separate from the rendering command. To give a simple example

Changed IPs of SERVER_FOO in IPAM and wish to update the ACLs

aclgen update_defs --name SERVER_FOO
aclgen render --policy_file protect_foo.pol

Weekly update everything

aclgen update_defs
aclgen render

I will go ahead and reopen this in case you have more followups and close it if I do not hear a response in a few days.

I would love to hear feedback from anyone else on their thoughts of IPAM integration either here or in #1

@ankenyr ankenyr reopened this Feb 5, 2023
@csiebert007
Author

Hi!

Thanks for your answer!
If I'd use any integration with an IPAM, I'd use it together with the Aerleon API and not with aclgen. Then I'd expect the network entries to end up at the same place as if I would read it from a YAML definition file. Therefore I had this question why this 'values' key is used. Also I'd expect that these entries could be merged between a network definitions file and the IPAM values. What is open is which one of the two would take higher precedence, but maybe that can be configurable.
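
A minimal sketch of the merge with configurable precedence described here, assuming both the YAML file and the IPAM lookup have already been loaded into plain dictionaries that map a network name to a list of entries. The function name `merge_networks` and the `prefer` parameter are hypothetical and not part of the Aerleon API; the addresses are made up.

```python
from typing import Dict, List

Entry = Dict[str, str]
Networks = Dict[str, List[Entry]]


def merge_networks(file_defs: Networks, ipam_defs: Networks, prefer: str = "file") -> Networks:
    """Merge two network tables; on a name clash, `prefer` decides which side wins."""
    if prefer not in ("file", "ipam"):
        raise ValueError("prefer must be 'file' or 'ipam'")
    low, high = (ipam_defs, file_defs) if prefer == "file" else (file_defs, ipam_defs)
    merged = {name: list(entries) for name, entries in low.items()}
    # Entries from the preferred source replace same-named entries wholesale.
    merged.update({name: list(entries) for name, entries in high.items()})
    return merged


file_defs = {"WEB_SERVERS": [{"address": "10.0.0.1/32"}]}
ipam_defs = {
    "WEB_SERVERS": [{"address": "10.0.0.9/32"}],
    "MAIL_SERVERS": [{"address": "10.0.0.3/32"}],
}
merged_file_first = merge_networks(file_defs, ipam_defs, prefer="file")
merged_ipam_first = merge_networks(file_defs, ipam_defs, prefer="ipam")
```

Replacing a clashing name wholesale is only one possible policy; combining the entries of both sides would be another.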

Similar to the network entries from an IPAM, the services entries could be read out from an NMS or a NetFlow management application, where this might already exist. W.r.t. the /etc/services, I would not have Aerleon read it automatically but only if I trust it, because I have it under control. But of course I could write my own parser and create a YAML file, so it's not a big deal.

Best regards,
Christoph

@ankenyr
Collaborator

ankenyr commented Feb 7, 2023

No worries Christoph!

I really appreciate your feedback.

So I think we need to understand if you are desiring to feed in IPs you have resolved yourself or wish to use the plugins for IPAMs that will eventually be allowed within the core of Aerleon.

How I think you want to do things is as follows

  1. Get IPs and Services to construct the definitions
  2. Feed that and a policy file into the API

and thus are confused why a 'values' key is necessary?

The way I envision this working is that the plugins live behind the API, so you would feed in a definition file that could look like

networks:
  MYSERVER:
    netbox:
      tag: myserver
  ANOTHER_SERVER:
    values:
      - address: 192.168.1.1/32
services:
  SSH:
    - port: 22
      protocol: tcp

In the case above, once you fed that into the API, the core Aerleon code would reach out to the configured netbox server, look for IPs tagged with myserver and use those.

What happens with the values then? That goes back to my previous talk about aclgen and how I see it being used. I could certainly use feedback on this though.

One of the issues when using aclgen in a large network with lots of policies, definitions, and automation around them is making manageable changes.

Let's take a situation where I have the plugin automation working and, as above, netbox automatically fills in the IPs from the IPAM when I run aclgen. I have one policy that includes the following terms

    terms:
      - name: allow-myserver
        destination-address: MYSERVER
        action: accept
      - name: allow-other-servers
        destination-address: ANOTHER_SERVER
        action: accept

Imagine I add a port to allow-other-servers and then run aclgen. I would want to see changes only for what I updated. If the plugin for netbox runs and returns a different set of IPs though, there will also be a change in allow-myserver.

In this example it would be easy to notice and probably verify, but what if you have a policy with a lot of terms? What if you are updating one static definition that covers multiple policy files? You could have every single policy updating with a cascade of changes. It would be hard for anyone reviewing your change to determine what you changed and if it was appropriate.

This is what the update_defs workflow is for. When I was doing heavy work with Capirca we did a full definitions update at the beginning of the week without changes to any policy files. This was accepted as a large change but it should only be updating IP addresses. Changes to terms themselves happened without updating the IPs or by updating only the specific definitions needed.
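
To make the review problem concrete, here is a small sketch that compares two snapshots of the definitions and lists which terms would pick up a change. It assumes definitions are plain name-to-address-list dictionaries and terms are dictionaries shaped like the policy above; `changed_definitions` and `affected_terms` are hypothetical helpers, not Aerleon functions, and the addresses are made up.

```python
from typing import Dict, List, Set


def changed_definitions(old: Dict[str, List[str]], new: Dict[str, List[str]]) -> Set[str]:
    """Return the names whose address sets differ between two snapshots."""
    names = set(old) | set(new)
    return {n for n in names if set(old.get(n, [])) != set(new.get(n, []))}


def affected_terms(terms: List[dict], changed: Set[str]) -> List[str]:
    """List the policy terms that reference any changed definition."""
    return [t["name"] for t in terms if t.get("destination-address") in changed]


before = {"MYSERVER": ["172.16.1.1/32"], "ANOTHER_SERVER": ["192.168.1.1/32"]}
after = {"MYSERVER": ["172.16.1.1/32", "172.16.1.2/32"], "ANOTHER_SERVER": ["192.168.1.1/32"]}
terms = [
    {"name": "allow-myserver", "destination-address": "MYSERVER"},
    {"name": "allow-other-servers", "destination-address": "ANOTHER_SERVER"},
]
changed = changed_definitions(before, after)
touched = affected_terms(terms, changed)
```

Running a check like this in a definitions-only change keeps the IP churn separate from edits to the terms themselves.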

We did not have a fully fledged API at that time like Aerleon has now, so maybe we can come up with something better. We have a small community in our Slack if you would like to join and have quicker, more ad-hoc discussions about this. I will also be posting this issue in the Slack channel asking people to give some thought to this.

@csiebert007
Author

Hi!

Many thanks for your answer.
I'm still not sure if I understand the syntax with the 'values' key correctly.
If it is about the source where the IP address comes from, I would have used the following structure:

networks:
  MYSERVER:
    - source: netbox
      tag: myserver
  ANOTHER_SERVER:
    - address: 192.168.1.1/32

If 'address' is not available, look for 'source' and that gives a hint from where (e.g. IPAM, SSoT, NMS, trusted local DNS server, ...) to do the lookup for a trusted IP address.
Then in the networks dictionary the 'address' parameter can be updated due to that lookup and all properties (including where it came from) are still on the same network object.

So in the Python networks dictionary the content would look like this after the lookup (for simplicity in YAML syntax):
networks:
  MYSERVER:
    - source: netbox
      tag: myserver
      address: 172.16.1.1/24
  ANOTHER_SERVER:
    - address: 192.168.1.1/32
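
A rough Python sketch of that lookup, assuming the networks are already loaded into a dictionary and that each source name maps to a callable which returns one address. `resolve_entries` and `fake_netbox` are hypothetical names for illustration only; `fake_netbox` returns a canned answer instead of querying a real IPAM.

```python
import copy
from typing import Callable, Dict, List

Lookup = Callable[[dict], str]


def resolve_entries(networks: Dict[str, List[dict]], lookups: Dict[str, Lookup]) -> Dict[str, List[dict]]:
    """Fill in 'address' for entries that only name a 'source'."""
    resolved = copy.deepcopy(networks)
    for entries in resolved.values():
        for entry in entries:
            if "address" in entry:
                continue  # static entry, nothing to look up
            source = entry.get("source")
            if source not in lookups:
                raise KeyError(f"no lookup registered for source {source!r}")
            entry["address"] = lookups[source](entry)
    return resolved


def fake_netbox(entry: dict) -> str:
    # Stand-in for a real IPAM query; returns a canned answer for the demo.
    return {"myserver": "172.16.1.1/24"}[entry["tag"]]


networks = {
    "MYSERVER": [{"source": "netbox", "tag": "myserver"}],
    "ANOTHER_SERVER": [{"address": "192.168.1.1/32"}],
}
resolved = resolve_entries(networks, {"netbox": fake_netbox})
```

The input dictionary is left untouched, so the unresolved and resolved forms can be compared afterwards.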

But overall, in my case I would more rely on a workflow that I have under control.
IMHO, the lookup support for netbox should not be in Aerleon, because besides netbox I could envision many other sources that could be relevant from where trusted network/host parameters can be retrieved (nautobot, trusted local DNS server, commercial IPAM system, NMS, ...).
The lookup will still be necessary in many use cases, but IMHO this can easily be scripted and adapted by use of the various REST-APIs provided in these external solutions.
Also adding support for netbox lookup would create Python module dependencies, which are not necessary in many other cases.
IMHO Aerleon should focus on the generation of ACLs for many platforms - maybe there are more to add?

But of course it could help Python coders to give examples on how to retrieve network/host parameters from netbox and save it into networks definitions ...

Sorry for this very personal point of view ...

Best regards,
Christoph

@ankenyr
Collaborator

ankenyr commented Feb 10, 2023

I want to hear your point of view because it sounds like you are very knowledgeable and passionate. This project needs people like that. So thank you and please keep engaged!

I believe we are actually close to the same idea. A definition would have two things in it, the current values for use in running aclgen and if available, information on where to pull new data. IPs that never change such as RFC1918 would never have a section instructing where to pull new data from.

The place we might diverge is when and how aclgen would pull new IPs. It sounds like you want to have more control over the management of the IPs and you will still have that. You could build automation yourself to add and remove IPs from the definitions.yml file and never include the optional section that tells where to pull new IPs from.

When the API takes a policy file and definitions and renders them to the resulting files, the data that says where to get new IPs would be ignored. IPs should not be updated on the fly across the entire policy file without some manner of control. A different API or tool would be used to update one, several, or all definitions as a separate operation from rendering the policy file.

The ability to have different sources to update your definitions will be a plugin model. We will define the interfaces and will likely support a few. We would encourage others to develop them also. We have seen demand for this, and enabling it as a plugin, where the code does not need to be contributed to us and the core code does not need to be forked and modified, means people can build their own and easily support their integration.
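
As a sketch of what such a plugin interface could look like (nothing here is a committed design): a source exposes one `fetch` method, and a separate `update_defs` step refreshes the stored 'values' without touching rendering. `DefinitionSource`, `StaticSource` and this `update_defs` function are hypothetical names, and `StaticSource` answers from an in-memory table rather than a real IPAM.

```python
from typing import Dict, List, Protocol


class DefinitionSource(Protocol):
    def fetch(self, spec: dict) -> List[str]:
        ...


class StaticSource:
    """Toy source that answers from an in-memory table instead of a real IPAM."""

    def __init__(self, table: Dict[str, List[str]]) -> None:
        self.table = table

    def fetch(self, spec: dict) -> List[str]:
        return list(self.table.get(spec["tag"], []))


def update_defs(defs: dict, sources: Dict[str, DefinitionSource], only: str = "") -> dict:
    """Refresh 'values' from the configured source; never called while rendering."""
    updated = {}
    for name, body in defs.items():
        body = dict(body)  # shallow copy so the input stays unchanged
        for key, source in sources.items():
            if key in body and only in ("", name):
                body["values"] = [{"address": a} for a in source.fetch(body[key])]
        updated[name] = body
    return updated


defs = {
    "MYSERVER": {"netbox": {"tag": "myserver"}, "values": [{"address": "10.0.0.1/32"}]},
    "RFC1918": {"values": [{"address": "10.0.0.0/8"}]},
}
sources = {"netbox": StaticSource({"myserver": ["10.0.0.7/32"]})}
refreshed = update_defs(defs, sources)
```

Definitions without a source section, such as RFC1918, pass through unchanged, and the `only` argument mirrors updating a single named definition.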

There are always more platforms to add, the next one we will be working on is Fortinet since it seems they won't be contributing the one they tried to provide to Capirca here.

@csiebert007
Author

Many thanks for your feedback.
Most likely I'll not use the aclgen tool, but instead the API, because it gives me the flexibility for the workflows and the integration with other functionality that I need.
Having a plugin model for different sources sounds very promising.
Also I'm very happy to hear that more platforms for ACL generation will be supported.
What about support for BSD (OpenBSD/FreeBSD/NetBSD) based on its pfctl/pf tools?

@nemith
Contributor

nemith commented Feb 11, 2023

I found this weird as well and I disliked the asymmetry with services.

I am not sure dynamic sources should strive to be driven inside the yaml files but instead just be added via code. There is nothing wrong with making aerleon more driven as a library for generating acls than just a cli tool. (i.e. dynamic definitions from netbox or any other source can just be added through code vs generated from a cli)

You get into a weird area with trying to support external sources via yaml files where there is going to be a thousand corner cases. Don't try to abstract it with yaml, provide first class support from code.

@ankenyr
Collaborator

ankenyr commented Feb 12, 2023

@csiebert007 Totally reasonable and encouraged. We want to support both large companies integrating into complex pipelines and the one-person network engineer serving a startup. Having an off-the-shelf solution ready for the latter is what aclgen is for, and a good API is helpful for aclgen as well as for the large, complex organizations that would use Aerleon. In my mind aclgen is a user of the Aerleon API, same as you and @nemith would be.

I believe we already support packetfilter but it may not be fully featured since I have never given it a run.
https://github.com/aerleon/aerleon/blob/main/aerleon/lib/packetfilter.py
Give it a try and let us know!

@nemith We do wish to provide a CLI tool that is "off the shelf". You seem to be suggesting that there is some API interface that may be missing currently that you think could be used instead? Could you describe what you are expecting? From my understanding you would want to write code to pull information from an IPAM, generate the correct definitions and send them with a policy to the existing API interface. I don't see where Aerleon fits in to the IPAM interfacing part since you would likely be using their own libraries in your own code. I feel like I am missing something.

As for the aesthetics of the yaml file as it currently is, we could make the yaml spec accept the addresses as a plain list while still supporting the current method.
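
Accepting both shapes could be as small as the following sketch, which assumes the YAML has already been parsed into Python objects. `network_entries` is a hypothetical helper and not how Aerleon's loader is actually written.

```python
from typing import Any, List


def network_entries(body: Any) -> List[Any]:
    """Return the entry list for a network written in either shape."""
    if isinstance(body, list):
        return body  # flat form, same as services
    if isinstance(body, dict) and isinstance(body.get("values"), list):
        return body["values"]  # current form with the intermediate key
    raise ValueError("network body must be a list or a mapping with a 'values' list")


flat = [{"address": "10.0.0.1/32", "comment": "Web Server 1"}]
nested = {"values": [{"address": "10.0.0.1/32", "comment": "Web Server 1"}]}
same = network_entries(flat) == network_entries(nested)
```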

@nemith
Contributor

nemith commented Feb 13, 2023

I guess my point is that this means your configuration has a tight dependency on external sources, which is going to increase the complexity and mean adding in a lot of exceptions.

Such dependencies are much easier to reason about and implement in code than trying to create configuration in yaml for them.

A completely made-up example:

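import ipaddress

import aerleon  # idealized API: parse_defs, add_network, Network and render are made up here


def main():
    definitions = aerleon.parse_defs("./def")

    # netbox stands for whatever IPAM client you use; policy is loaded elsewhere
    prefixes = netbox.query_ipam(... some very specific and performant query...)
    for prefix in prefixes:
        # ...additional filter logic

        definitions.add_network(
            Network(name="NB_" + prefix.name.upper(), value=ipaddress.ip_network(prefix.address))
        )
    print(aerleon.render(definitions, policy))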

This is here to showcase an idealized way of defining policies via code and not yaml, with code being a first-class citizen rather than taking a back seat.

  1. Queries for netbox are customized and specific. This gets out of the problem of libraries like nornir that fetch the world and filter later.
  2. netbox is an example, but an end user can use any system. You could even use Google Sheets APIs and pull in data that way. The flexibility is driven by code and doesn't need aerleon developers to add abstractions to the yaml files. Yes, you can make this pluggable, but then that just adds more complexity to a really simple problem.
  3. To do this in yaml you would have to query for everything in netbox that could be relevant and then recreate each object in yaml duplicating a lot of work.
  4. Sidenote: The api needs to be cleaned up to allow for ease of use.

I have already run into many instances where yaml just isn't going to work without adding a template layer on top (like how ansible handles playbooks), where you want things like loops to expand out many terms, etc.

Yaml is great for simple cases, but I am really afraid of how complex it will get trying to handle "everything". You start trying to turn yaml into a scripting language and yaml is bad at it (despite the success of ansible).

@ankenyr
Collaborator

ankenyr commented Feb 27, 2023

Netbox has an API and it supports tagging of IPs.
Our YAML config could have a spec like

networks:
  MYSERVER:
    netbox:
      tag: myserver

This would query netbox with the following: https://demo.netbox.dev/api/ipam/ip-addresses/?tag=myserver

It would not load everything. I am confused as to why you think it would.
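
A small sketch of what that single query amounts to, using only the standard library. It assumes the response is a JSON object with a `results` list whose items carry an `address` field; the sample payload is hand-written, not captured from a real server, and `tag_query_url` and `addresses_from_response` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode


def tag_query_url(base_url: str, tag: str) -> str:
    """Build the ip-addresses query for a single tag."""
    query = urlencode({"tag": tag})
    return f"{base_url.rstrip('/')}/api/ipam/ip-addresses/?{query}"


def addresses_from_response(payload: str) -> list:
    """Pull the 'address' field out of each result in a JSON list response."""
    return [item["address"] for item in json.loads(payload).get("results", [])]


url = tag_query_url("https://demo.netbox.dev", "myserver")
# Hand-written sample payload, not captured from a real server.
sample = '{"count": 1, "results": [{"address": "172.16.1.1/24"}]}'
addresses = addresses_from_response(sample)
```

Only the addresses carrying the tag come back, so the amount of data fetched scales with the tag, not with the whole IPAM.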


4 participants