Bug #17520

open

Puppet consumes excessive amounts of CPU and memory when importing facts from hosts with many NICs (puppet_fact_parser.rb / Solaris)

Added by Noh Wayh over 7 years ago. Updated over 7 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Importers
Target version:
-
Difficulty:
medium
Triaged:
Fixed in Releases:
Found in Releases:

Description

It seems that fact import from hosts with many NICs (Solaris) causes the Foreman processes to run amok with CPU and RAM: all Passenger instances consume 100% CPU and keep growing in memory until the OS starts swapping. As a consequence, Foreman performs worse and the imports get even slower, and so on.
My suspicion is that puppet_fact_parser.rb might be at fault.
I am not sure whether the CPU consumption is triggered by a single host with many NICs, or by many hosts with many NICs getting their facts imported within a short time interval.

I will attach a facter JSON output from a host with many NICs (there are hosts with 254 IPs, so the fact output might be modified to match).
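For context, my suspicion is that the parser walks the comma-separated interfaces fact and then looks up each per-interface fact (ipaddress_<iface>, netmask_<iface>, macaddress_<iface>, ...) one by one, so ~970 interfaces multiply into thousands of lookups and attribute writes per import. A minimal sketch of that pattern, assuming facter-style flat facts (illustrative only, not the actual puppet_fact_parser.rb code):

# Illustrative sketch of per-NIC fact parsing, NOT the real Foreman code.
# Assumes a flat facter-style JSON such as the attached facter-solaris10-manynics.json.
require 'json'

facts = JSON.parse(File.read('facter-solaris10-manynics.json'))

interfaces = {}
facts.fetch('interfaces', '').split(',').each do |name|
  # Every interface pulls in several more facts; with ~970 NICs this
  # turns into thousands of hash lookups and, later, attribute/DB writes.
  key = name.strip.downcase
  interfaces[name.strip] = {
    'ipaddress'  => facts["ipaddress_#{key}"],
    'netmask'    => facts["netmask_#{key}"],
    'macaddress' => facts["macaddress_#{key}"],
  }
end

puts "parsed #{interfaces.size} interfaces"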

Additional output - Passenger under httpd without any special tweaking done:

passenger-memory-stats

...
...

------ Passenger processes -------
PID VMSize Private Name
----------------------------------
33114 2078.3 MB 837.3 MB Passenger RackApp: /usr/share/foreman
37420 1564.7 MB 798.4 MB Passenger RackApp: /usr/share/foreman
38605 6878.9 MB 4012.7 MB Passenger RackApp: /usr/share/foreman
41681 6045.9 MB 3801.0 MB Passenger RackApp: /usr/share/foreman
44608 5917.8 MB 3953.4 MB Passenger RackApp: /usr/share/foreman
52642 209.8 MB 0.0 MB PassengerWatchdog
52645 1338.9 MB 1.3 MB PassengerHelperAgent
52651 214.2 MB 0.0 MB PassengerLoggingAgent
57032 1116.3 MB 437.1 MB Passenger RackApp: /usr/share/foreman
### Processes: 9
### Total private dirty RSS: 13841.33 MB

Files

facter-solaris10-manynics.json facter-solaris10-manynics.json 33.6 KB Solaris 10 facter json output (many NICS) Noh Wayh, 11/29/2016 10:31 AM
solaris10.json solaris10.json 160 KB solaris host facts with 970 interfaces Noh Wayh, 01/26/2017 07:55 AM
solaris10-2.json solaris10-2.json 158 KB solaris host facts with 970 interfaces Noh Wayh, 01/26/2017 07:55 AM
Actions #1

Updated by Dominic Cleal over 7 years ago

The production.log, preferably with SQL logging enabled, would be useful too please (https://theforeman.org/manuals/1.13/index.html#7.2Debugging) to help determine which part of the import is slow.

Actions #3

Updated by Noh Wayh over 7 years ago

Dominic Cleal wrote:

The production.log, preferably with SQL logging enabled, would be useful too please (https://theforeman.org/manuals/1.13/index.html#7.2Debugging) to help determine which part of the import is slow.

Do you have any tool for anonymising IP addresses/hostnames in the production.log file, so that I can help out better?

Actions #4

Updated by Noh Wayh over 7 years ago

Edit:
We have a few hosts with ~970 virtual NICs named e1000g0_1 ..... e1000g0_970, resulting in a couple of thousand facts that need to be looped through.
I presume this might screw things up memory- and CPU-wise; using the attached facts file and modifying it appropriately might give some more info (a rough generator sketch follows below).

Also, all memory on the host is now consumed and swap is heavily utilized, intermittently dropping when the Foreman processes are respawned.
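For reproduction without real host data, a synthetic facts file in the same shape as the attached ones could be generated along these lines (a rough sketch; the interface names follow the e1000g0_N pattern above, all addresses are made up):

# Rough generator for a synthetic facter-style facts file with many
# virtual NICs (e1000g0_1 .. e1000g0_970); all values are made up.
require 'json'

facts = {
  'hostname'        => 'solaris10-test',
  'operatingsystem' => 'Solaris',
  'interfaces'      => (1..970).map { |i| "e1000g0_#{i}" }.join(',')
}

(1..970).each do |i|
  facts["ipaddress_e1000g0_#{i}"]  = "10.#{i / 254}.#{(i % 254) + 1}.10"
  facts["netmask_e1000g0_#{i}"]    = '255.255.255.0'
  facts["macaddress_e1000g0_#{i}"] = format('00:16:3e:00:%02x:%02x', i / 256, i % 256)
end

File.write('solaris10-synthetic.json', JSON.pretty_generate(facts))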

Actions #5

Updated by Marek Hulán over 7 years ago

In environments like this I think it would be better to disable NIC parsing. Our UI probably wouldn't be usable with 1000 NICs anyway. You can find this option at Administer -> Settings -> Provisioning -> Ignore Puppet facts for provisioning (set it to Yes). While this is not a fix, hopefully it is a workaround for you.
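If the web UI is hard to reach while the instance is thrashing, the same setting can presumably also be flipped from a Foreman Rails console; a sketch, assuming the setting behind that UI option is named ignore_puppet_facts_for_provisioning and that the Setting[] accessor applies here:

# Run inside the Foreman Rails console (e.g. foreman-rake console).
# Assumes the setting is named ignore_puppet_facts_for_provisioning.
Setting[:ignore_puppet_facts_for_provisioning] = true
puts Setting[:ignore_puppet_facts_for_provisioning]   # expect: true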

Updated by Noh Wayh over 7 years ago

I dumped the memory of the Foreman instances that were consuming CPU cycles and memory, and observed that the network facts from the suspected hosts were indeed everywhere.

The culprits were removed from Foreman, followed by a restart of Foreman. The system load dropped from 6 to 2, memory usage went down to healthy levels and no swap usage was observed.

Attaching the culprits' facts (excluding any custom facts).

If you can use them to test and reproduce the problem, that would be great (try running the fact import often to load the Foreman instance).
Unfortunately I cannot add the production.log due to its massive size and non-anonymized data.
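To generate that load, repeatedly pushing the attached facts through the fact upload API should be a reasonable approximation; a sketch, assuming the standard /api/hosts/facts endpoint, an admin account and a valid SSL setup (the URL, host name and credentials are placeholders to adjust):

# Repeatedly upload the attached facts to simulate frequent imports.
# Assumes Foreman's fact upload API at /api/hosts/facts; the URL,
# host name and credentials below are placeholders.
require 'json'
require 'net/http'
require 'uri'

facts = JSON.parse(File.read('solaris10.json'))
uri   = URI('https://foreman.example.com/api/hosts/facts')

10.times do
  req = Net::HTTP::Post.new(uri)
  req.basic_auth('admin', 'changeme')
  req['Content-Type'] = 'application/json'
  req.body = { 'name' => 'solaris10-test.example.com', 'facts' => facts }.to_json

  res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
  puts "#{res.code} #{res.message}"
end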
