Feature #13736
open
Identify Problematic Puppet Modules
Description
Would it be possible to add a page where problematic hosts are grouped by the Puppet modules that are throwing the errors/notices? This would help us tell when a particular Puppet module needs fixing, rather than going host by host and only gradually noticing that a module seems to throw a lot of errors. It would also help us triage our current situation, where various Puppet modules are throwing many errors: we could see which modules are the "most" broken and which errors are the most common, fix the biggest problems first, and then move on to the smaller ones.
Updated by Marek Hulán over 7 years ago
This might be hard to do, since our reports are not matched to specific modules or manifests. We could at least compute some statistics based on message counts, e.g.
Message.joins(:logs).select("messages.*, COUNT (logs.*)").group(:'messages.id').order(:count => :desc).where("level_id >= ?", Log::LEVELS.index(:err)).limit(10)
would render the following query:
2016-10-18T16:55:02 [sql] [D] Message Load (4.1ms) SELECT messages.*, COUNT (logs.*) FROM "messages" INNER JOIN "logs" ON "logs"."message_id" = "messages"."id" WHERE (level_id >= 4) GROUP BY "messages"."id" ORDER BY "messages"."count" DESC LIMIT 10
Maybe a widget displaying the 10 most frequent error messages would solve this?
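For illustration, the counting behind such a widget can be sketched in plain Ruby without touching the database. This is only a sketch under assumptions: `LogEntry`, `LEVELS`, and `top_error_messages` are hypothetical names, the sample messages are made up, and the severity order is assumed (the logged query only confirms that `:err` maps to 4). One thing the sketch makes explicit is that the ranking should sort on the aggregated count itself; the rendered SQL orders by `"messages"."count"`, which appears to refer to a column on `messages` rather than to the `COUNT` aggregate, so the real query would probably need an alias for the aggregate.

```ruby
# Hypothetical stand-in for a report log line: message text plus severity.
LogEntry = Struct.new(:message, :level)

# Assumed severity order; only :err => 4 is confirmed by the logged query.
LEVELS = [:debug, :info, :notice, :warning, :err, :alert, :emerg, :crit].freeze

# Returns [message, count] pairs for entries at or above min_level,
# most frequent first (ties broken alphabetically), capped at `limit` rows.
def top_error_messages(entries, min_level: :err, limit: 10)
  threshold = LEVELS.index(min_level)
  entries
    .select { |entry| LEVELS.index(entry.level) >= threshold }
    .group_by(&:message)
    .map { |message, group| [message, group.size] }
    .sort_by { |message, count| [-count, message] }
    .first(limit)
end

# Made-up sample data, not taken from a real report.
entries = [
  LogEntry.new("Could not find class foo", :err),
  LogEntry.new("Could not find class foo", :err),
  LogEntry.new("Dependency failed", :err),
  LogEntry.new("Applied catalog", :notice),
  LogEntry.new("Could not find class foo", :alert)
]

top_error_messages(entries).each do |message, count|
  puts "#{count}\t#{message}"
end
```

The `:notice` entry is filtered out, and the remaining entries are grouped by message text, which mirrors the `GROUP BY "messages"."id"` in the query above.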