File Integrity Monitoring | User Information - Linux #7401

narph · 2023-08-16T08:50:55Z

Similar to Auditbeat's FIM module, our new FIM integration can monitor for file changes, but it does not include the user information needed to capture who modified or accessed a file. This is a significant visibility gap for security analysts and a heavily requested enhancement.

Research needs to be done to determine how we can capture user information within our FIM integration and any underlying changes required. Can the OS components we rely on today be leveraged or is an underlying change to how we gather FIM data needed?

Meta issue #3310

elasticmachine · 2023-08-16T08:51:06Z

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

chemamartinez · 2023-09-06T14:30:11Z

I did some initial research into the alternatives I found for getting user data on Linux.

Fsnotify

https://github.com/fsnotify/fsnotify

It is currently used by our FIM module for recursive monitoring. Unfortunately, it doesn't support getting user information for now, as is explained in this recent issue.

They are trying to add support for fanotify but it is not ready yet, and seems to be stalled: fsnotify/fsnotify#542

So it does not seem to be a valid option for now.

Fanotify

https://github.com/torvalds/linux/tree/master/fs/notify/fanotify

Fanotify is a file access notification system built into the Linux kernel. It's designed to supersede inotify in some use cases, especially those related to FIM, by providing more detailed information and allowing for responses (like blocking actions).

In 2017, fanotify was very limited:

Compared with inotify, fanotify's assortment of events might feel limited. At present, creating, deleting, and removing events are not supported: You can watch files and directories being opened, accessed, and closed, and that's it. Moreover, mmap() generates no events. Fanotify isn't an inotify replacement; instead, it focuses on cases such as malware scanning and hierarchical storage management.

Source: https://www.linux-magazine.com/Issues/2017/194/Core-Technologies

Support for the missing events was added in Linux 5.1, which was released in 2019.
https://man7.org/linux/man-pages/man7/fanotify.7.html
https://kernelnewbies.org/Linux_5.1#Improved_fanotify_for_better_file_system_monitorization

Fanotify seems to be a valid way to get the user information: each event carries the PID of the process that triggered it, and the FAN_REPORT_PIDFD flag can be enabled to additionally receive a pidfd for that process along with the other metadata.

We already have some previous work to include fanotify in FIM, and also there are some external projects that could be interesting to explore such as https://github.com/opcoder0/fanotify. However, it seems that fanotify can lead to reliability issues based on past experiences.

Auditd

https://man7.org/linux/man-pages/man8/auditd.8.html

The Audit daemon is another option since it provides the user information about any change in the monitored file systems. It is also used by other FIM solutions like Wazuh or LogRhythm.

The main problem with using Auditd is that it is a dependency that must be installed on the host to be monitored, so we would be responsible for the Auditd status and for loading the rules, which may sometimes conflict with rules already configured.

eBPF

https://ebpf.io

eBPF is a more recent and very powerful technology in the Linux kernel that can be used for a variety of monitoring, tracing, and networking tasks, including file integrity monitoring. eBPF allows users to run custom programs in the kernel space safely, without modifying the kernel source code or loading additional modules. When it comes to file integrity monitoring, eBPF can be employed to trace specific syscalls related to file operations and collect relevant data.

eBPF provides some considerable benefits:

Performance: lightweight and efficient, minimal overhead.
Flexibility: programs can be attached, modified, or detached from the kernel at runtime without any kernel restarts.
Granularity: you can collect just what you need (specific syscalls, user IDs, paths, etc.)

It also involves more challenges compared to other solutions:

Learning curve
While there are some libraries that make it easier to handle eBPF programs from Go, the actual eBPF code is typically still written in a restricted subset of C. There isn't a way to write eBPF programs directly in Go. The Go runtime has features and behaviors that are not compatible with the constraints of eBPF programs.

The usual approach, I think, is to write the eBPF program in C, compile it, and then use one of the available Go libraries (or a new one created by us) to load, attach, and manage the eBPF program in the kernel from a Go application.

Kprobes

https://docs.kernel.org/trace/kprobes.html#

At a lower level than the other solutions, kprobes is a kernel feature that allows instrumenting the kernel by setting breakpoints and catching system calls, so it could be useful for FIM by catching file-oriented calls such as open(), write() or unlink().

Some considerations about implementing kprobes for FIM:

Mainly, I see similar advantages as for eBPF, regarding efficiency and flexibility.
On the other hand, working directly in kernel space could be dangerous, since any failure may affect the whole system in terms of both stability and security. Therefore, this solution would involve more complexity in every aspect.

A deeper analysis would be needed to determine if this is a feasible path.

norrietaylor · 2023-09-07T18:32:26Z

Hello @chemamartinez 👋

The Linux Platform team discussed this issue in a recent team meeting. Our discussion produced some additional information that you may find interesting and valuable.

First, the crux of this issue is that we need to associate a file operation with a calling process. In Linux, the process is the entity that is operating on a file. As such, it is also the entity that is associated with an effective user. If you can determine the pid of the process, it becomes straightforward to get other helpful metadata, including that of the user.

In our discussion, we focussed on three of the options you listed above for retrieving process information: fanotify, kprobes, and eBPF.

Fanotify

We do not recommend leveraging fanotify for this metadata. The reasons for this are primarily related to reliability.

Fanotify can operate in notification or permission mode. In notification mode, the application would receive the pid but would need to scrape the /proc filesystem after an event for user metadata. This approach would introduce a race condition for short-lived processes, in which the required metadata would be absent. To solve this problem, we could receive an event in permission mode. In this situation, Auditbeat would hold a lock on the file while we scraped the /proc filesystem for the metadata. This technique is inherently a system stability risk: if, for some reason, Auditbeat is killed while it holds this lock, the entire filesystem can be blocked. Locking a production Linux host's file system can be a source of severe SDH issues.
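The /proc lookup and its race can be sketched in a few lines of Python. This is an illustration only, not Auditbeat's code: the function name effective_uid and the proc_root parameter are hypothetical, and the sketch relies on the "Uid:" line of /proc/<pid>/status, whose four fields are the real, effective, saved-set, and filesystem UIDs.

```python
import os
from typing import Optional


def effective_uid(pid: int, proc_root: str = "/proc") -> Optional[int]:
    """Return the effective UID of `pid`, or None if the process is gone.

    Hypothetical helper: `proc_root` is a parameter only so the lookup can
    be pointed at a fake directory tree when testing.
    """
    status_path = os.path.join(proc_root, str(pid), "status")
    try:
        with open(status_path, encoding="utf-8") as status:
            for line in status:
                if line.startswith("Uid:"):
                    # Fields after the label: real, effective, saved-set, filesystem.
                    return int(line.split()[2])
    except (FileNotFoundError, ProcessLookupError):
        # The race described above: the process exited before we looked.
        return None
    return None
```

A caller would have to treat None as "user unknown", because the metadata of a short-lived process cannot be recovered after it has exited.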

eBPF

eBPF is a great candidate as it can instrument a kernel event in real-time and transmit a safe event to userspace containing much of the information we need without worrying about scraping /proc.

While eBPF requires some C knowledge to write programs, this should be manageable and not involve a significant learning curve. Most eBPF tracing programs are relatively simple and easy to understand. In fact, many of the existing probes in https://github.com/elastic/ebpf could be used as they are.

There are some limitations to eBPF that we should highlight. First, there is an instruction limit for eBPF programs, which means complex logic can be challenging to implement. Algorithms that involve elaborate filtering or string parsing are often better handled in userspace. Second, eBPF can be problematic to work with on older Linux kernels. Our team prefers BTF and bpf ring buffer support, which were introduced in 5.4 and 5.8, respectively. Practically, this means 5.10 kernels and newer are suited for eBPF tracing and telemetry.

Kprobes

Kprobes are also a great candidate as they can transmit a safe event to userspace in real-time. Kprobes can be set up to use tracefs, meaning you are not operating directly in kernel space. This mitigates any stability concerns you have raised.

When using tracefs you are bound to predefined format structures, which limits the data points you can instrument. They are also not a formal API, which means they can change as the kernel is updated. This limitation means kprobes and tracefs are less flexible than eBPF.

The advantage of this approach is that much older kernels are supported.

Recommendation

Our general recommendation would be to enable Auditbeat to make a runtime decision based on the kernel version to use a kprobe or eBPF implementation. Older kernels would use kprobes, and newer kernels would use eBPF. These two implementations could be developed serially, prioritizing kernel support that would satisfy the most customers.
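The runtime decision could look roughly like the following Python sketch. The function names and the backend strings "ebpf" and "kprobes" are assumptions made for illustration; the 5.10 threshold is the one mentioned in the eBPF section above.

```python
import re


def kernel_version(release: str) -> tuple:
    """Parse the leading 'major.minor' of a release string like '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def choose_backend(release: str, ebpf_minimum: tuple = (5, 10)) -> str:
    """Pick 'ebpf' for kernels at or above the threshold, otherwise 'kprobes'."""
    return "ebpf" if kernel_version(release) >= ebpf_minimum else "kprobes"
```

Calling choose_backend(platform.release()) would apply the check to the running kernel. A version comparison is only a proxy: distributions backport features, so probing for the capability itself (for example, whether /sys/kernel/btf/vmlinux exists) may turn out to be more reliable.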

Another request to be aware of is that of powering Session View with Auditbeat events. We are also discussing enriching Auditbeat events with process-oriented metadata for this task. Ideally, the same backend could serve both goals and be reusable for future requests.

andrewkroh · 2023-10-11T15:37:01Z

We have not decided on a technology at this point. In order to have a better understanding of what data is required, I'm listing the current event triggers and the data that is being reported. We need to support amd64 and arm64 with this.

Event Triggers

File or dir created (e.g. open, mkdir, symlink, link)
File or dir renamed
File or dir deleted
File modified (written, truncated)
Attributes modified (e.g. chmod, chown, timestamps, and extended attributes)
(fyi) Also at startup auditbeat can scan the filesystem to look for deltas since it last ran.

Data

Apart from the inotify trigger, all of the data about the files is collected from userspace. For example, when it gets an inotify IN_ATTRIB event, it will stat and getxattr to see what changed. We don't have to modify how this part works, but if there is an opportunity for the events to include specific information about what changed, then that could make FIM more efficient.
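As a rough illustration of the "stat and compare" step, here is a minimal Python sketch. It is not how Auditbeat implements it: the helper names and the choice of tracked fields are assumptions, and extended attributes are left out to keep the example portable.

```python
import os

# Fields compared between two stat results (an illustrative subset).
TRACKED_FIELDS = ("st_mode", "st_uid", "st_gid", "st_size", "st_mtime_ns")


def snapshot(path: str) -> dict:
    """Capture the tracked stat fields for `path` without following symlinks."""
    info = os.lstat(path)
    return {field: getattr(info, field) for field in TRACKED_FIELDS}


def changed_fields(before: dict, after: dict) -> dict:
    """Return {field: (old, new)} for every tracked field that differs."""
    return {
        field: (before[field], after[field])
        for field in TRACKED_FIELDS
        if before[field] != after[field]
    }
```

After a chmod, for example, comparing the snapshots taken before and after would report only st_mode. If the kernel-side event already said which attribute changed, this userspace comparison could be skipped.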

Auditbeat is reporting this file metadata with events:

path
target_path (for symlinks)
inode
owner uid
owner gid
size
mtime
ctime
type (dir, file, symlink, etc)
mode
xattrs (specifically security.selinux and system.posix_acl_access)
file content hashes

We want to add information about the process and user that triggered the file change event. I was thinking of adding some minimal metadata that gives you some context and enables you to pivot to process events via entity_id when you need richer process data (this assumes that the user has enabled process events in Auditbeat).

process.name
process.pid
process.entity_id
user.id
user.name
container.id
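To make the shape of the proposed fields concrete, the sketch below expands the dotted names into a nested structure. The helper nest and every value in example_fields are made up for illustration; none of them come from a real event.

```python
def nest(flat: dict) -> dict:
    """Expand dotted keys such as 'process.pid' into nested dictionaries."""
    nested: dict = {}
    for dotted_key, value in flat.items():
        node = nested
        *parents, leaf = dotted_key.split(".")
        for parent in parents:
            node = node.setdefault(parent, {})
        node[leaf] = value
    return nested


# Illustrative values only.
example_fields = {
    "process.name": "vim",
    "process.pid": 4242,
    "process.entity_id": "example-entity-id",
    "user.id": "1000",
    "user.name": "alice",
    "container.id": "example-container-id",
}
```

Calling nest(example_fields) groups the values under process, user, and container, which is how they would sit alongside the existing file metadata in an event.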

pkoutsovasilis · 2023-10-18T11:57:09Z

Hello also from my end 👋 Thank you for your messages and initial investigation @chemamartinez, @norrietaylor, and @andrewkroh.

With your comments in mind, I have been investigating, in theory so far, what a kprobe-based solution could look like to match the existing inotify-based Event Triggers mentioned above by @andrewkroh. Yesterday, I had a meeting with @nicholasberlin, @stanek-michal and @norrietaylor where we discussed my early thinking, and we all deemed that this could be feasible. In a few words, I am thinking of initially scanning the directories that need to be monitored, extracting the inode number of each one, and then using the appropriate kprobe-based events to determine which of them affect files/folders that we want to monitor.
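The initial scan and the inode matching could be sketched as follows. This is a simplified illustration under stated assumptions, not the planned implementation: the function names are hypothetical, the index holds files as well as directories, and entries are keyed by (device, inode) because inode numbers are only unique within one filesystem.

```python
import os


def build_inode_index(roots: list) -> set:
    """Walk each root and collect (device, inode) pairs for every entry found."""
    index = set()
    for root in roots:
        paths = [root]
        for dirpath, dirnames, filenames in os.walk(root):
            paths.extend(os.path.join(dirpath, name) for name in dirnames + filenames)
        for path in paths:
            try:
                info = os.lstat(path)
            except FileNotFoundError:
                continue  # Entry vanished between the walk and the lstat.
            index.add((info.st_dev, info.st_ino))
    return index


def is_monitored(index: set, device: int, inode: int) -> bool:
    """Report whether an event's (device, inode) pair belongs to a watched entry."""
    return (device, inode) in index
```

A real backend would also have to keep the index current as monitored files are created, renamed, and deleted, which this sketch leaves out.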

Of course, there are certain open questions, such as how many of these kprobe-based events we can process and associate in a timely manner, the portability of a kprobe-based solution, etc. For that reason, I will start coding a PoC that builds on top of the theory, so that we will have more quantitative answers to such questions.

The thinking so far is to make a separate kprobe-based backend for the FIM module that is able to produce events that also include the pid that caused each change. A processor can then use that pid to enrich the events accordingly before sending them out.

norrietaylor · 2024-05-15T15:43:56Z

@jamiehynds, perhaps we should discuss acceptance criteria and GA plan for this ticket?

narph added release-pending Team:Security-External Integrations labels Aug 16, 2023

narph mentioned this issue Aug 16, 2023

[Meta]File Integrity Monitoring | User Information #3310

Open

narph added 8.11 candidate Integration:FIM File integrity monitor labels Aug 16, 2023

narph assigned chemamartinez Aug 16, 2023

jamiehynds added the Epic label Aug 30, 2023

norrietaylor assigned mmat11 Oct 6, 2023

nick-alayil added Team: Linux Platform 8.12 candidate labels Oct 16, 2023

mmat11 mentioned this issue Oct 24, 2023

EventProbe: capture file info from inode elastic/ebpf#178

Merged

mmat11 mentioned this issue Dec 4, 2023

[auditbeat] fim: implement ebpf backend elastic/beats#37223

Merged

9 tasks

Tacklebox mentioned this issue Dec 11, 2023

File Integrity Monitoring | User Information - Linux | KProbe Backend #8692

Closed

MikePaquette added 8.13 candidate temp labels Dec 20, 2023

Tacklebox mentioned this issue Dec 28, 2023

Add backend configuration key to fim integration #8807

Merged

4 tasks

andrewkroh mentioned this issue Jan 3, 2024

[Integration: FIM] No process information in FIM Events from Linux systems #4826

Closed

pkoutsovasilis mentioned this issue Jan 10, 2024

Move (re-license) tracing package and introduce 'allowundefined' in kprobe struct tag elastic/beats#37602

Merged

4 tasks

narph added Team:Security-Linux Platform Linux Platform Security team and removed Team:Security-External Integrations labels Jan 25, 2024

pkoutsovasilis mentioned this issue Jan 31, 2024

[auditbeat] fim: implement kprobes backend elastic/beats#37796

Merged

7 tasks

dhru42 added the 8.14 candidate label Feb 12, 2024

mmat11 mentioned this issue Mar 6, 2024

[Auditbeat] fim(ebpf): enrich file events with process data elastic/beats#38199

Merged

7 tasks

mmat11 mentioned this issue Mar 14, 2024

[Auditbeat] fim(ebpf): enrich file events with container id elastic/beats#38328

Merged

7 tasks

pkoutsovasilis mentioned this issue Apr 3, 2024

[Auditbeat] fim(kprobes): enrich file events by coupling add_process_metadata processor elastic/beats#38716

Closed

7 tasks

This was referenced Apr 5, 2024

[8.13](backport #38199) [Auditbeat] fim(ebpf): enrich file events with process data elastic/beats#38742

Merged

[8.13](backport #38328) [Auditbeat] fim(ebpf): enrich file events with container id elastic/beats#38775

Merged

pkoutsovasilis mentioned this issue Apr 8, 2024

[Auditbeat] fim(kprobes): enrich file events by coupling add_process_metadata processor elastic/beats#38776

Merged

7 tasks

mergify bot mentioned this issue Apr 13, 2024

[8.13](backport #38776) [Auditbeat] fim(kprobes): enrich file events by coupling add_process_metadata processor elastic/beats#38916

Closed

7 tasks

norrietaylor closed this as completed May 22, 2024
