Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NETOBSERV-1255 Adding Raw Packet export capability as Packet Capture Agent(PCA) #113

Merged
merged 2 commits into from
Sep 22, 2023

Conversation

shach33
Copy link
Contributor

@shach33 shach33 commented Apr 19, 2023

This PR extends the agent to export raw packets through the network. We call it Packet Capture Agent(PCA).

This functionality is turned off by default. I have added 2 new environments to enable this.
set ENABLE_PCA=true to activate packet cpature.
set PCA_FILTER=[ipproto,port number] to indicate the filter for packet capture, e.g. export PCA_FILTER=tcp,22

The PCA_FILTER defaults to udp DNS(udp,53). If ENABLE_PCA is true, but filters are not specified, by default it captures DNS packets.

The captured packets are written to a socket in PCAP format.

@eranra
Copy link
Collaborator

eranra commented Apr 20, 2023

@msherif1234, for awareness --- the idea behind this PR is to become the basis for both PANO (of course) but also for other Packet level network observability requirements, such as "selective" packet mirroring - functionality such as network tracing etc. So generalization and customization of the content are key ( hope this make sense to you )

@codecov
Copy link

codecov bot commented Apr 24, 2023

Codecov Report

Patch coverage: 3.03% and project coverage change: -7.41% ⚠️

Comparison is base (7d31bf6) 39.56% compared to head (1f17d37) 32.16%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #113      +/-   ##
==========================================
- Coverage   39.56%   32.16%   -7.41%     
==========================================
  Files          31       36       +5     
  Lines        2391     2960     +569     
==========================================
+ Hits          946      952       +6     
- Misses       1388     1951     +563     
  Partials       57       57              
Flag Coverage Δ
unittests 32.16% <3.03%> (-7.41%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
pkg/agent/packets_agent.go 0.00% <0.00%> (ø)
pkg/ebpf/tracer.go 0.00% <0.00%> (ø)
pkg/exporter/packets.go 0.00% <0.00%> (ø)
pkg/flow/packet_record.go 0.00% <0.00%> (ø)
pkg/flow/perfbuffer.go 0.00% <0.00%> (ø)
pkg/flow/tracer_perf.go 0.00% <0.00%> (ø)
pkg/agent/agent.go 39.11% <58.06%> (+1.26%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@msherif1234 msherif1234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took initial look
I see ur created a parallel agent instead of extending the current one after u check the new config knob can u explain the reasoning ?

bpf/flows.c Outdated Show resolved Hide resolved
bpf/flows.c Outdated Show resolved Hide resolved
bpf/flows.c Outdated Show resolved Hide resolved
bpf/flows.c Outdated Show resolved Hide resolved
bpf/flows.c Outdated Show resolved Hide resolved
bpf/flows.c Outdated Show resolved Hide resolved
bpf/flows.c Outdated Show resolved Hide resolved
pkg/agent/ip.go Outdated Show resolved Hide resolved
@shach33
Copy link
Contributor Author

shach33 commented May 16, 2023

I took initial look I see ur created a parallel agent instead of extending the current one after u check the new config knob can u explain the reasoning ?

The current agent works at the level of flows and flow metrics. With PANO, we wish to use a 'packet' as the unit of communication. The struct elements of flow metrics are no longer required. Hope this helps.

@msherif1234
Copy link
Contributor

I took initial look I see ur created a parallel agent instead of extending the current one after u check the new config knob can u explain the reasoning ?

The current agent works at the level of flows and flow metrics. With PANO, we wish to use a 'packet' as the unit of communication. The struct elements of flow metrics are no longer required. Hope this helps.

yeah I am aware of that as u know I went through the same trouble when I prototyped TCPlife span hook but was able to refactor the current agent so it can handle both probably something we need to poll the team on @jotak @eranra @praveingk wdyt ? will be great if we somehow abstract support to perf map as we have other use cases as well that wanted to use it like the one I mentioned above and more will come and will be nice to share this so all hooks need to generate perf event will reuse

@memodi
Copy link
Contributor

memodi commented May 17, 2023

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label May 17, 2023
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label May 18, 2023
@dushyantbehl dushyantbehl self-requested a review May 23, 2023 17:52
@shach33 shach33 changed the title [WIP] Adding Raw Packet export capability for PANO Adding Raw Packet export capability for PANO May 24, 2023
Copy link
Contributor

@dushyantbehl dushyantbehl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shach33 can you also remove the temp1.pcap file which I assume got pushed along with your commits.

bpf/flows.c Outdated Show resolved Hide resolved
bpf/flows.c Outdated Show resolved Hide resolved
bpf/flows.c Outdated Show resolved Hide resolved
bpf/flows.c Outdated Show resolved Hide resolved
bpf/flows.c Outdated Show resolved Hide resolved
pkg/ebpf/tracer.go Outdated Show resolved Hide resolved
pkg/ebpf/tracer.go Outdated Show resolved Hide resolved
pkg/exporter/ipfix.go Outdated Show resolved Hide resolved
}

// ReadPacketFrom reads a PacketRecord from a binary source, in LittleEndian order
func ReadRawPacketFrom(reader io.Reader) (*PacketRecord, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rename this to what it does ReadRawPacketFromBuffer or just ReadRawPacket

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to ReadRawPacket

bpf/flows.c Outdated Show resolved Hide resolved
@dushyantbehl
Copy link
Contributor

I belive yes we can extract out common code and enable one or both functionalities in the agents rather than extending the code which can create code duplication.

@shach33
Copy link
Contributor Author

shach33 commented May 31, 2023

Performed first level refactoring.

bpf/flows.c Outdated Show resolved Hide resolved
bpf/flow.h Outdated Show resolved Hide resolved
bpf/flows.c Outdated Show resolved Hide resolved
pkg/agent/agent.go Outdated Show resolved Hide resolved
@shach33 shach33 changed the title Adding Raw Packet export capability for PANO NETOBSERV-349 Adding Raw Packet export capability as Packet Capture Agent(PCA) Jul 18, 2023
@dushyantbehl
Copy link
Contributor

dushyantbehl commented Jul 19, 2023

This PR needs a code review. Waiting for @shach33 to commit final version before providing the review

.gitignore Outdated
@@ -4,3 +4,5 @@
bin/
e2e-logs
ebpf-agent.tar
*.o
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is unclear to me why we would ignore .o files, I think we shouldn't?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe its a good practice to not include object files in commits. We cant get their diffs, they always have merge problems. But if thats how we do things. I will revert.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if err != nil {
logrus.WithError(err).Fatal("can't instantiate NetObserv eBPF Agent")
}
if config.EnablePCA {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this means the agent is started either for flow logs or for PCA, but cannot do both at the same time, right?
Do you see a technical limitation for doing so, or is that just a choice and we could change that any time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The flows and packets agent are disparate and only one is active at a time. The metric of transfer in flows agent is a flow struct, while with PCA is a packet. We could merge them but that would add multiple unnecessary if checks at every step.

If the team deems this to be a better way to go, I can later create another PR with the merge. As of now, the code is cleanly separated between packet capture and flows agent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think if we are going to keep agent either PCA or Flow logs we should specify that make a single environment like AGENT_TYPE which can be PCA or FlowLogs

Copy link
Contributor Author

@shach33 shach33 Aug 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can discuss this in our calls and make a decision then. In this PR, I would suggest to let it stay the way it is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

@jotak
Copy link
Member

jotak commented Aug 23, 2023

Performed Resource analysis on the PR with 6 different use cases.

1. base netobserv:main

2. PR with PCA disabled

3. PR with PCA enabled - invalid filter

4. PR with PCA enabled - filter tcp,80

5. PR with PCA enabled - filter udp,53

6. PR with PCA enabled - filter tcp,443

@shach33 if I understand correctly your perf measurements, the current ebpf agent (without your PR) consumes twice as much memory than with your PR with PCA disabled? (1=~280MB vs 2=~140MB). This would be good news but is quite unexpected, how can we explain that? Or it could be a wrong measurement?

@jotak
Copy link
Member

jotak commented Aug 23, 2023

For info, I did some tests on my cluster for memory usage: there's indeed an increase with this feature (even when disabled) , I guess essentially due to the map. On my cluster, that's around 10MB increase per pod. Since it is constant, the more loaded is the agent, the more negligible is this overhead.

@shach33
Copy link
Contributor Author

shach33 commented Aug 24, 2023

Kernel does not support BPF_F_NO_PREALLOC for LRU type maps.
cilium/ebpf#1072

//PerfEvent Array for Packet Payloads
struct {
__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
__type(key, int);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls use u32

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

__type(key, __u32);
__type(value, __u32);
} packet_record SEC(".maps");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BPF_F_NO_PREALLOC not avaliable for all map types

bpf/pca.h Outdated
// Set flag's upper 32 bits with the size of the paylaod and the bpf_perf_event_output will
// attach the specified amount of bytes from packet to the perf event
// https://github.com/xdp-project/xdp-tutorial/tree/9b25f0a039179aca1f66cba5492744d9f09662c1/tracing04-xdp-tcpdump
flags |= (__u64)packetSize << 32;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't there a limit of how much hdr we can send with perf event ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems there is no enforced limit by the kernel and if userspace can't hold it the event will be dropped

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the size is too big, the call to bpf_perf_event_output will fail and return -EFAULT.

meta.if_index = skb->ifindex;
meta.pkt_len = packetSize;
meta.timestamp = current_time;
if (bpf_perf_event_output(skb, &packet_record, flags, &meta, sizeof(meta))){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the point of checking the return code if either way u return ok ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was an oversight. Fixed it.

}
sourcePort = udp_header->source;
destPort = udp_header->dest;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not right , we can't assume if its not TCP its UDP, we have SCTP, ICMPv4 and ICMPv6 protocols support

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Only checks for tcp/udp, for anything else, we return.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sctp will work very similar to ycp and udp unless we see no use case for it ? which isn't true because svc can use sctp protocol taht is why we added it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added SCTP. ICMP packets dont really have port associated to them. So, it would require an update to the filter specifications, which I can do as part of an extension, but may be not with the current PR.

bpf/pca.h Outdated
}

if (validate_pca_filter(ip->protocol, (void *)ip + sizeof(*ip), data_end )){
return attach_packet_payload(data, data_end, skb, current_time);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: u don't use current_time here so u can read it in attach_packet_payload and remove it from here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Fixed.

bpf/pca.h Outdated
return export_packet_payload(skb);
}

#endif /* __PCA_H__ */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit pls add line to the end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

bpf/pca.h Outdated
}

// Only IPv4 and IPv6 packets captured
if (eth->h_proto != bpf_htons(ETH_P_IP) && (eth->h_proto != bpf_htons(ETH_P_IPV6))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be more efficient instruction wise to do

u16 ethType = bpf_ntohs(eth->h_proto);
if (ethType != ETH_P_IP && ethType != ETH_P_IPV6) {
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

canceler()
}()
return ctx
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did u add this ? for standalone testing ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was to reuse the same code block for terminating both netobserv and PCA.

}
if config.EnablePCA {
if config.PCAFilters == "" {
logrus.Fatal("[PCA] NetObserv eBPF Agent instantiated without filters to identify packets. Please set environment variable PCA_FILTER.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we treat that as capture everything instead of error ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we treat this for capture everything? That might be extremely resource intensive. This was a design decision we made. This could be updated if the team deems this behavior correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wanted to share one of the PM customers asked use case is to do capture all (just learnt this now) cc'd @jpinsonneau

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 it seems to be a use case now 🙃
Would it work if we capture everything expect our own netobserv traffic ? Let's say with unlimited resources

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to collect and export all packets if the filter (enviroment variable PCA_FILTER) is not set at all.
However if the filter is wrongly set, incorrect format/ non-supported protocols etc, then the agent doesnt export anything.

Copy link
Member

@jotak jotak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shach33 : here's a suggestion as discussed on slack. This is to minimize the memory allocation when PCA isn't used.

@@ -117,12 +124,19 @@ func NewFlowFetcher(cfg *FlowFetcherConfig) (*FlowFetcher, error) {
return nil, fmt.Errorf("rewriting BPF constants definition: %w", err)
}

//Deleting specs for PCA
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//Deleting specs for PCA
//Deleting specs for PCA
// Always set pcaRecordsMap to the minimum in FlowFetcher - PCA and Flow Fetcher are mutually exclusive.
spec.Maps[pcaRecordsMap].MaxEntries = 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure Joel.

pkg/ebpf/tracer.go Show resolved Hide resolved
if (packetSize < skb->len){
packetSize = skb->len;
if (bpf_skb_pull_data(skb, skb->len)){
return TC_ACT_UNSPEC;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we come to this case for TCP DNS have u tried to capture DNS over TCP and measured the CPU usage ? u can simply do dig google.om +tcp to do DNS over TCP

Copy link
Contributor Author

@shach33 shach33 Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what is happening here, but when I experiment, the requests/response generated via dig google.com +tcp are also UDP packets.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no that is indeed tcp dns I have used it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then, I am able to capture those.

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Aug 25, 2023
bpf/pca.h Outdated
u32 packetSize = (__u32)(data_end-data);

// Record the current time.
__u64 current_time = bpf_ktime_get_ns();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we don't use __types in the agent code so probably need to stick to the same style

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

if (packetSize < skb->len){
packetSize = skb->len;
if (bpf_skb_pull_data(skb, skb->len)){
return TC_ACT_UNSPEC;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no that is indeed tcp dns I have used it

bpf/pca.h Outdated
// Set flag's upper 32 bits with the size of the paylaod and the bpf_perf_event_output will
// attach the specified amount of bytes from packet to the perf event
// https://github.com/xdp-project/xdp-tutorial/tree/9b25f0a039179aca1f66cba5492744d9f09662c1/tracing04-xdp-tcpdump
flags |= (__u64)packetSize << 32;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems there is no enforced limit by the kernel and if userspace can't hold it the event will be dropped

}
sourcePort = udp_header->source;
destPort = udp_header->dest;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sctp will work very similar to ycp and udp unless we see no use case for it ? which isn't true because svc can use sctp protocol taht is why we added it

io.Closer
Register(iface ifaces.Interface) error

LookupAndDeleteMap() map[int][]*byte
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you aren't using this interface ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am. This is the one being used to delete map.

getLen := make([]byte, 4)
packetTimestamp := make([]byte, 8)
// Read IfIndex and discard it: To be used in other usecases
_ = binary.Read(reader, binary.LittleEndian, make([]byte, 4))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a way to show interface name with the collected flow, u already have if_index in thr perf hdr

iface, err := net.InterfaceByIndex(int(eventHdr.IfId))
ifcace.Name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PCA eventually only transmits a packet stream. How do you think we should include the interface information?

logrus.WithField("packets", len(evictingEntries)).
Debug("evicting packets from userspace accounter after reaching cache max length")
c.evict(evictingEntries, out)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do u need the above logs won't some of this will flood the logs ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a debug log. But removed it for now.

jotak
jotak previously approved these changes Aug 29, 2023
Copy link
Member

@jotak jotak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @shach33 , LGTM on my side, I'm leaving it in @msherif1234 / @dushyantbehl hands! :)

destPort = sctp_header->dest;
}
else
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Please add {} here with else to keep the code structure same across all C code.

@shach33

bpf/pca.h Outdated
}
else
return false;
if (sourcePort == bpf_htons(pca_port) || destPort == bpf_htons(pca_port)){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shach33 nit: you can possibly do bpf_htons(pca_port) once.

Copy link
Contributor

@dushyantbehl dushyantbehl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added 2 minor things for @shach33.
Let me know when they are done and I can add a lgtm here.

Leaving for @msherif1234 for taking a final view

@jpinsonneau
Copy link
Contributor

@shach33 can we just rebase this ? then we are good to merge 😸
thanks !

@msherif1234
Copy link
Contributor

@shach33 why the changes in main is showing up in ur PR as new diffs ?

@shach33
Copy link
Contributor Author

shach33 commented Sep 22, 2023

@msherif1234 This is fixed now.

@msherif1234
Copy link
Contributor

/lgtm
/approve

@openshift-ci
Copy link

openshift-ci bot commented Sep 22, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: msherif1234

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 4297938 into netobserv:main Sep 22, 2023
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved lgtm no-doc This PR doesn't require documentation change on the NetObserv operator no-qe This PR doesn't necessitate QE approval
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants