Bug #4826

closed

Investigate if we can speed up reposet enablement in the Red Hat repositories page

Added by Mike McCune about 10 years ago. Updated almost 6 years ago.

Status: Closed
Priority: High
Assignee:
Category: Web UI
Target version:
Difficulty: medium
Triaged: Yes
Fixed in Releases:
Found in Releases:

Description

0) Make sure you have a manifest imported

1) Navigate to the Content -> Red Hat Repositories page

2) If you select a group of repositories, say:

  • Red Hat Enterprise Linux 6 Server (RPMs)

it can take 1-2 minutes or more for the subset of only around 10-15 repos to show up.

I would like us to spend some time looking at the code that detects the set of repos underneath that list, to see if we can speed it up.

Actions #1

Updated by Mike McCune about 10 years ago

  • Description updated (diff)
Actions #2

Updated by Ivan Necas about 10 years ago

I'm working on the Dynflowization of repo set enablement: running the repo creations in parallel will probably help, and we can try to identify the slow bits and address them ad hoc.

Actions #3

Updated by Ivan Necas about 10 years ago

  • Assignee changed from Justin Sherrill to Ivan Necas
Actions #4

Updated by Ivan Necas about 10 years ago

With the Dynflowized repo enable, enabling a reposet with 12 repos took 12 seconds against a mirrored cdn and 25 seconds against the real one.

There are two slow things here:

1. loading the listing files from the cdn (I don't know yet how long it takes; I will have more data tomorrow)
2. creating the repositories: a single repo is created in about 2 seconds. However, when we create 12 repos in parallel, the last one takes about 8 seconds because it waits in a queue for a while for the repo publish.

For the second step, we could:

a. skip the repo publishing and postpone it until syncing, since without syncing the repo is not of much use.
b. increase the pulp concurrency options to be able to run more distribution publishes in parallel

Option a. could also help with http://projects.theforeman.org/issues/4724
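
To illustrate why option b. could help, here is a rough queueing sketch. It assumes every repo job takes the same time and that a fixed number of worker slots is available; the worker counts below are made-up values for illustration, since the actual pulp concurrency setting is not stated in this thread.

```python
import math


def last_finish_time(jobs, workers, seconds_per_job):
    """Return when the last of `jobs` equal-length jobs finishes.

    Simplified model (an assumption, not measured pulp behaviour):
    jobs run in waves of at most `workers` at a time.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    waves = math.ceil(jobs / workers)
    return waves * seconds_per_job


# 12 repos at about 2 seconds each, as measured above.
# The worker counts are hypothetical.
for workers in (3, 6, 12):
    print(workers, "workers ->", last_finish_time(12, workers, 2.0), "s")
```

Under these assumptions, 3 worker slots would reproduce the observed 8 seconds for the last repo, and doubling the slots would halve it. Real publish times vary per repo, so this only shows the shape of the effect.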

Actions #5

Updated by Ivan Necas about 10 years ago

I've split the reposet enable code into more dynflow actions to get more numbers in hand:

1. load initial data to be able to start scanning: 2s
2. the scanning of the cdn itself: 3.5s
3. create 12 repos locally in the db: 16s; the majority of the time was the repository.save itself
4. create 12 repos in pulp in parallel: 7s total, in the range of 3-7s; most of the time was the distributor publish stuff

This gives us around 30s for repo set enable.
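
For reference, a small tally of the four measurements above. The phase labels are just shorthand for the list items; the numbers are the ones reported in this comment.

```python
# Measured phase durations in seconds, copied from the list above.
phases = {
    "load initial data": 2.0,
    "scan cdn": 3.5,
    "create 12 repos in local db": 16.0,
    "create 12 repos in pulp (parallel)": 7.0,
}

total = sum(phases.values())  # 28.5, i.e. "around 30s"

for name, seconds in phases.items():
    share = 100 * seconds / total
    print(f"{name}: {seconds}s ({share:.0f}% of total)")
print(f"total: {total}s")
```

The local db creation is more than half of the total, which is why it is the most interesting target.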

This is already a boost compared to the old style, where the repo creation (both local db and pulp orchestration) was run in sequence, which gave us more than 60s for the task.
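
A minimal sketch of the sequential versus parallel idea. This is not Dynflow and not the Katello code: a Python thread pool stands in for the parallel execution, and `create_repo`, its delay, and the repo names are hypothetical placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def create_repo(name, delay=0.01):
    # Hypothetical stand-in for one repo creation (db save plus pulp call).
    time.sleep(delay)
    return f"created {name}"


def enable_sequential(names):
    # Old style: one repo after another.
    return [create_repo(name) for name in names]


def enable_parallel(names, max_workers=4):
    # New style: several creations in flight at once.
    # pool.map() returns results in the same order as the input.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(create_repo, names))


repo_names = [f"repo-{i}" for i in range(12)]
print(enable_parallel(repo_names))
```

Both functions return the same results; the parallel one just overlaps the waiting time, which is where the saving comes from when each creation is dominated by I/O.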

Possible optimizations are:

1. run the local repo creation in parallel as well (would give us around -10s)
2. disable the metadata generation after a repo is created, as we don't need it until some content appears there anyway (would give us around -8s)
3. postpone the repo creation to the repo-enable phase (would give us around -25s); the reposet enable would then be just about loading the listing files from the cdn, which takes around 5 seconds. Later optimization could be

My favorite is option 3, and now is a good time to make this change: we need to add support for repo-enable in the new cli, so we will touch the code anyway.
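
To compare the three options side by side, here is a rough projection. It assumes the rounded 30s baseline and takes each estimated saving at face value, applied on its own; the option labels are shorthand for the list items.

```python
BASELINE_SECONDS = 30.0  # "around 30s" for reposet enable today

# Estimated savings per option, in seconds, from the list above.
savings = {
    "1. parallel local repo creation": 10.0,
    "2. skip metadata generation": 8.0,
    "3. postpone repo creation": 25.0,
}

# Each option is applied alone; combined effects are not modelled.
projected = {name: BASELINE_SECONDS - saved for name, saved in savings.items()}

for name, seconds in projected.items():
    print(f"{name}: about {seconds}s")
```

Option 3 lands at about 5 seconds, which matches the time quoted for just loading the listing files from the cdn.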

Actions #6

Updated by Ivan Necas about 10 years ago

Having just the cdn scan behind the reposet enable checkbox takes us to something around 8 seconds, which is not that bad.

Actions #7

Updated by Ivan Necas about 10 years ago

  • Status changed from Assigned to Ready For Testing
Actions #8

Updated by Ivan Necas about 10 years ago

  • Status changed from Ready For Testing to Closed
  • % Done changed from 0 to 100

Applied in changeset katello|commit:f0581b72a63bd9c4aca36f62c59523097fa338b9.

Actions #9

Updated by Eric Helms over 9 years ago

  • Release set to 13