An error in docker: failed to create shim task

When I try to run this command:

docker start e16fc114c32f

I get this error response from daemon:

failed to create task for container: 
failed to create shim task: 
OCI runtime create failed: 
runc create failed: unable to start container process: unable to apply cgroup configuration: 
unable to start unit "docker-e16fc114c32f6c9154ae3c2c0bb8ade6298ff61eaf61529b58bd2939eefd7f7c.scope" (properties [{Name:Description Value:
"libcontainer container e16fc114c32f6c9154ae3c2c0bb8ade6298ff61eaf61529b58bd2939eefd7f7c"} {Name:Slice Value:"system.slice"} 
{Name:Delegate Value:true} 
{Name:PIDs Value:@au [303179]} 
{Name:MemoryAccounting Value:true} 
{Name:CPUAccounting Value:true} 
{Name:IOAccounting Value:true} 
{Name:TasksAccounting Value:true} 
{Name:DefaultDependencies Value:false}]): 
Unit docker-e16fc114c32f6c9154ae3c2c0bb8ade6298ff61eaf61529b58bd2939eefd7f7c.scope was already loaded or has a fragment file.: unknown

Docker Chat/AI gave no solution. I can delete the container and create it again, but after some time I get the error again. It also sometimes starts normally, seemingly at random.

I don’t use Docker locally, so I’m not very familiar with it, but could it be a leftover process from a previous runtime?


Docker error: failed to create shim task / docker-<container-id>.scope was already loaded or has a fragment file

When running:

docker start e16fc114c32f

Docker fails with:

failed to create task for container:
failed to create shim task:
OCI runtime create failed:
runc create failed: unable to start container process:
unable to apply cgroup configuration:
unable to start unit "docker-e16fc114c32f6c9154ae3c2c0bb8ade6298ff61eaf61529b58bd2939eefd7f7c.scope"
...
Unit docker-e16fc114c32f6c9154ae3c2c0bb8ade6298ff61eaf61529b58bd2939eefd7f7c.scope was already loaded or has a fragment file.

Short diagnosis

This does not look like a normal Dockerfile or application-code problem.

The important part is not just:

failed to create shim task

That phrase is broad and can appear for many runtime failures.

The important part is this:

unable to apply cgroup configuration:
unable to start unit "docker-<full-container-id>.scope"
Unit docker-<full-container-id>.scope was already loaded or has a fragment file.

That points to a host-side Docker/containerd/runc/systemd/cgroup problem.
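
To pull the exact unit name out of a long error message like the one above, a small filter can help. This is only a sketch: extract_scope_unit is a hypothetical helper name, and it assumes the unit follows the docker-<64 hex characters>.scope pattern seen in this error.

```shell
#!/usr/bin/env bash
# Sketch: extract the systemd scope unit name from a Docker error message.
# extract_scope_unit is a hypothetical helper, not a Docker or systemd command.
extract_scope_unit() {
  # Read error text on stdin; print the first docker-<64 hex>.scope found.
  grep -oE 'docker-[0-9a-f]{64}\.scope' | head -n 1
}

# Usage (commented out because it needs a Docker host):
#   docker start e16fc114c32f 2>&1 | extract_scope_unit
```

If nothing is printed, the error probably does not involve a systemd scope unit at all.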

Most likely:

A previous failed start, crash, OOM kill, restart loop, or runtime cleanup race left a stale or failed systemd scope unit for this exact container ID. When Docker tries to start the same container again, systemd refuses to create/start the same docker-<full-container-id>.scope unit.

That also explains why deleting and recreating the container helps temporarily: the recreated container gets a new full container ID, so Docker uses a new systemd scope name. The old stuck unit name is avoided, but the underlying trigger may still exist.


Background: what docker-<id>.scope means

Docker containers are tracked and limited using Linux cgroups. On systems where Docker uses the systemd cgroup driver, Docker containers appear as systemd scope units such as:

system.slice/docker-<full-container-id>.scope

Docker’s own runtime metrics documentation shows paths like:

/sys/fs/cgroup/system.slice/docker-<longid>.scope/

for cgroup v2 with the systemd driver.

The Docker daemon reference also explains that with systemd cgroups, cgroup parents are systemd slice names.

Your error shows systemd properties such as:

Slice=system.slice
Delegate=true
MemoryAccounting=true
CPUAccounting=true
IOAccounting=true
TasksAccounting=true
DefaultDependencies=false

Those are not application-level settings. They are systemd/cgroup properties used while Docker/runc prepares the container’s resource-control scope.

runc also documents this systemd behavior. When using systemd cgroups, runc creates a transient systemd scope and sets properties such as Delegate, Slice, CPUAccounting, MemoryAccounting, IOAccounting, and TasksAccounting.

So the failure path is approximately:

docker start e16fc114c32f
  -> Docker daemon
    -> containerd
      -> containerd-shim
        -> runc
          -> systemd/cgroup setup
            -> create/start docker-<full-container-id>.scope
              -> systemd says that unit is already loaded or has a fragment file

This means Docker fails before the containerized application meaningfully starts.


Why this is probably not a Dockerfile problem

A Dockerfile or entrypoint problem usually looks like:

exec: "foo": executable file not found in $PATH
permission denied
no such file or directory
ModuleNotFoundError
address already in use

Your error is different:

unable to apply cgroup configuration
unable to start unit docker-<full-id>.scope

That is lower-level. It happens while Docker/runc is setting up the host-side cgroup/systemd scope.

The application can still be an indirect trigger if it causes:

  • OOM kills;
  • fast crash loops;
  • repeated restarts;
  • too many child processes;
  • excessive memory/thread usage;
  • host resource exhaustion.

But the direct failure is systemd refusing the docker-<id>.scope unit.
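
The distinction above can be sketched as a rough triage filter. classify_start_error is a hypothetical helper, and its patterns are just the example messages quoted in this answer, so treat it as an illustration rather than a complete classifier.

```shell
#!/usr/bin/env bash
# Sketch: rough triage of a "docker start" error message read from stdin.
# classify_start_error is a hypothetical helper; the patterns come from the
# example messages in this answer and are not an exhaustive list.
classify_start_error() {
  local msg
  msg=$(cat)
  case "$msg" in
    *"unable to apply cgroup configuration"*|*"was already loaded or has a fragment file"*)
      echo "host-cgroup" ;;      # host-side systemd/cgroup setup failed
    *"executable file not found"*|*"permission denied"*|*"no such file or directory"*|*"address already in use"*)
      echo "image-or-app" ;;     # more likely an image or application problem
    *)
      echo "unknown" ;;
  esac
}

# Usage (commented out because it needs a Docker host):
#   docker start e16fc114c32f 2>&1 | classify_start_error
```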


Most likely causes, ranked

1. Stale or failed docker-<full-id>.scope unit

This is the strongest hypothesis because the error literally says:

Unit docker-<full-id>.scope was already loaded or has a fragment file.

That means systemd still knows about the unit name Docker is trying to create.

Possible states:

  • the unit is failed;
  • the unit is still loaded;
  • a transient unit fragment still exists;
  • systemd still has bookkeeping for the unit;
  • cleanup after a previous failure did not complete;
  • a PID is still associated with that scope.

There is a similar historical Docker/Moby issue where Docker failed with:

Unit docker-<id>.scope already exists.

The issue involved a systemd cgroup scope getting left behind and preventing the container from starting.

Those are old issues, so they are not proof of the exact same current bug. But they show that this failure mode exists: a Docker/systemd scope can be left behind and later collide with a new start of the same container ID.


2. OOM kill or memory pressure

This is the next thing I would check.

Docker’s resource-constraint documentation explains that when the Linux kernel detects an out-of-memory condition, it can kill container processes. Docker also warns against disabling the OOM killer without memory limits because the host itself can become unstable.

A very similar OCI-runtime case exists in crun/Podman. After a container hit OOM, a failed systemd libpod-<id>.scope remained, and later starts failed with:

Unit libpod-<id>.scope was already loaded or has a fragment file.

That case involves Podman rather than Docker, but the mechanism is similar:

container runtime
  -> systemd scope
    -> OOM or hard kill
      -> failed/stale scope remains
        -> restart fails with "scope was already loaded or has a fragment file"

So even though your visible error mentions systemd, the original trigger may still be memory exhaustion.


3. Crash/restart loop

A restart loop can amplify this problem.

Example:

container starts
  -> application exits immediately
    -> Docker restart policy starts it again
      -> application exits again
        -> cleanup/start race eventually leaves scope state behind

Check for restart policies such as:

always
unless-stopped
on-failure

The old Moby stale-scope issue involved a container dying on startup while something restarted it repeatedly.

If your container has a high restart count, the stale scope may be a symptom of the repeated lifecycle churn.


4. systemd / D-Bus / cgroup runtime instability

The fact that the container sometimes starts normally matters.

A deterministic Dockerfile problem usually fails consistently. Intermittent success suggests timing, cleanup, systemd, D-Bus, cgroup, or runtime state.

Similar reports exist where Docker failed at the same layer:

failed to create shim task
unable to apply cgroup configuration
unable to start unit "docker-...scope"

These reports are not all about Docker, but they support the same general pattern: rapid or repeated container lifecycle operations can expose systemd/cgroup unit-creation races.


5. Rootless Docker issue

Your pasted error says:

Slice Value:"system.slice"

That makes rootless Docker less likely. Rootless Docker more often involves user.slice paths.

Still, check it:

docker info | grep -i rootless

If rootless mode is enabled, use systemctl --user for some checks.


6. Old or mismatched Docker/containerd/runc/systemd/kernel versions

This stack has several moving parts:

Linux kernel
systemd
Docker daemon
containerd
containerd-shim
runc
cgroup v1/v2

Docker’s cgroup v2 documentation notes version requirements, including containerd v1.4+, runc v1.0.0-rc91+, and kernel v4.15+ with v5.2+ recommended.

If Docker, containerd, runc, systemd, or the kernel are old or mixed from unusual package sources, update them through the OS-supported path.


Immediate recovery

Step 1: Get the full container ID and exact systemd unit

CID=e16fc114c32f
FULL_ID=$(docker inspect -f '{{.Id}}' "$CID")
UNIT="docker-${FULL_ID}.scope"

echo "CID=$CID"
echo "FULL_ID=$FULL_ID"
echo "UNIT=$UNIT"

Expected shape:

UNIT=docker-e16fc114c32f6c9154ae3c2c0bb8ade6298ff61eaf61529b58bd2939eefd7f7c.scope

That should match the unit in the error.
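
One pitfall with the snippet above: if docker inspect fails, FULL_ID is empty and UNIT silently becomes docker-.scope. A guarded variant could look like the sketch below. scope_unit_for is a hypothetical helper, and it assumes rootful Docker with the systemd cgroup driver, as in the error above.

```shell
#!/usr/bin/env bash
# Sketch: build the scope unit name only from a full container ID.
# scope_unit_for is a hypothetical helper. It refuses anything that is not a
# 64-character lowercase hex ID, so an empty FULL_ID (for example when
# docker inspect fails) never turns into the bogus unit name "docker-.scope".
scope_unit_for() {
  local full_id=$1
  if printf '%s' "$full_id" | grep -qE '^[0-9a-f]{64}$'; then
    printf 'docker-%s.scope\n' "$full_id"
  else
    echo "not a full container ID: '$full_id'" >&2
    return 1
  fi
}

# Usage (commented out because it needs a Docker host):
#   FULL_ID=$(docker inspect -f '{{.Id}}' "$CID") && UNIT=$(scope_unit_for "$FULL_ID")
```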

For a generic copy/paste version:

CID=<CID>
FULL_ID=$(docker inspect -f '{{.Id}}' "$CID")
UNIT="docker-${FULL_ID}.scope"

echo "CID=$CID"
echo "FULL_ID=$FULL_ID"
echo "UNIT=$UNIT"

Step 2: Inspect Docker’s view of the container

docker inspect "$CID" \
  --format 'Status={{.State.Status}} ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Error={{.State.Error}} Restarting={{.State.Restarting}} RestartCount={{.RestartCount}}'

Also check restart policy and memory settings:

docker inspect "$CID" \
  --format 'RestartPolicy={{json .HostConfig.RestartPolicy}} Memory={{.HostConfig.Memory}} MemorySwap={{.HostConfig.MemorySwap}} OomKillDisable={{.HostConfig.OomKillDisable}}'

Interpretation:

Output | Meaning
--- | ---
OOMKilled=true | Memory pressure is probably a root trigger.
ExitCode=137 | Often SIGKILL; commonly OOM, though not always.
RestartCount high | Restart loop likely.
Restarting=true | Docker may already be retrying.
Memory=0 | No hard memory limit configured.
RestartPolicy={"Name":"always"...} | Automatic restarts may amplify the issue.
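
The state-related rows of that interpretation can be applied mechanically to the first docker inspect line from this step. This is a sketch only: interpret_state is a hypothetical helper, and the RestartCount threshold of 5 is my own arbitrary assumption, not a Docker default.

```shell
#!/usr/bin/env bash
# Sketch: print hints for one "Status=... ExitCode=... OOMKilled=..." line.
# interpret_state is a hypothetical helper. The RestartCount threshold of 5
# is an arbitrary assumption, not a Docker default.
interpret_state() {
  local line=$1 count
  case "$line" in *"OOMKilled=true"*) echo "oom: memory pressure is a probable trigger" ;; esac
  case "$line" in *"ExitCode=137 "*|*"ExitCode=137") echo "sigkill: exit code 137, often but not always OOM" ;; esac
  case "$line" in *"Restarting=true"*) echo "restarting: Docker may already be retrying" ;; esac
  # Pull the number after RestartCount= without splitting the line on spaces,
  # because the Error= field may itself contain spaces.
  count=$(printf '%s\n' "$line" | sed -nE 's/.*RestartCount=([0-9]+).*/\1/p')
  if [ -n "$count" ] && [ "$count" -ge 5 ]; then
    echo "restart-loop: RestartCount=$count"
  fi
}

# Usage (commented out because it needs a Docker host):
#   interpret_state "$(docker inspect "$CID" --format 'Status={{.State.Status}} ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Error={{.State.Error}} Restarting={{.State.Restarting}} RestartCount={{.RestartCount}}')"
```

No output means none of these hints applied, which does not rule out a stale scope.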

Step 3: Inspect the systemd scope

sudo systemctl status "$UNIT" --no-pager --full || true

sudo systemctl show "$UNIT" \
  -p Id -p LoadState -p ActiveState -p SubState -p FragmentPath -p ControlGroup \
  --no-pager || true

sudo systemctl list-units --all --type=scope | grep "$FULL_ID" || true
sudo systemctl --failed | grep "$FULL_ID" || true

Look for:

Observation | Likely meaning
--- | ---
ActiveState=failed | Failed stale scope; reset may help.
LoadState=loaded | systemd still knows the unit.
FragmentPath=/run/systemd/... | Transient unit fragment may still exist.
ActiveState=active | Some process may still be in that scope; do not kill blindly.
No unit appears, but Docker still fails | Docker/containerd may hold stale runtime state.
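
As a rough sketch, the same reading can be applied to the KEY=VALUE output of systemctl show. suggest_next_step is a hypothetical helper; it only looks at LoadState and ActiveState and treats anything else as "no unit visible".

```shell
#!/usr/bin/env bash
# Sketch: read `systemctl show` KEY=VALUE output on stdin, print one hint.
# suggest_next_step is a hypothetical helper, not part of systemd.
suggest_next_step() {
  local key value load="" active=""
  while IFS='=' read -r key value; do
    case "$key" in
      LoadState)   load=$value ;;
      ActiveState) active=$value ;;
    esac
  done
  if [ "$active" = "failed" ]; then
    echo "failed: stale scope, reset-failed may help"
  elif [ "$active" = "active" ]; then
    echo "active: a process may still be in the scope, do not kill blindly"
  elif [ "$load" = "loaded" ]; then
    echo "loaded: systemd still knows the unit"
  else
    echo "absent: Docker/containerd may hold stale runtime state"
  fi
}

# Usage (commented out because it needs a systemd host):
#   sudo systemctl show "$UNIT" -p LoadState -p ActiveState --no-pager | suggest_next_step
```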

Also inspect the PID shown in the original error:

ps -fp 303179 || true
cat /proc/303179/cgroup 2>/dev/null || true
sudo systemctl status 303179 --no-pager --full 2>/dev/null || true
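
To check whether that PID really sits in the container's scope, the cgroup file can be reduced to just the unit name. This is a sketch: scope_of_cgroup_file is a hypothetical helper, and it assumes cgroup paths of the form shown earlier (/system.slice/docker-<full-id>.scope).

```shell
#!/usr/bin/env bash
# Sketch: read the contents of a /proc/<pid>/cgroup file on stdin and print
# the docker scope unit the process belongs to, if any.
# scope_of_cgroup_file is a hypothetical helper.
scope_of_cgroup_file() {
  # Matches both cgroup v2 lines (0::/system.slice/docker-<id>.scope) and
  # cgroup v1 lines (N:controller:/system.slice/docker-<id>.scope).
  sed -nE 's#.*/(docker-[0-9a-f]{64}\.scope)(/.*)?$#\1#p' | head -n 1
}

# Usage (commented out because the PID only exists on the affected host):
#   scope_of_cgroup_file < /proc/303179/cgroup
```

If the printed unit equals $UNIT, a process is still attached to the scope that Docker is trying to create.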

Step 4: Reset the failed scope

Try the least disruptive fix first:

sudo systemctl reset-failed "$UNIT"
sudo systemctl daemon-reload

docker start "$CID"

systemctl reset-failed resets the failed state of specified units and also resets some per-unit counters such as start-rate-limit counters and service restart counters.

If this works, the immediate blocker was the failed/stale systemd scope.

Important: this does not prove the underlying cause is fixed. It only clears the stuck systemd state.


Step 5: If reset alone does not work, stop the unit carefully

First inspect it:

sudo systemctl status "$UNIT" --no-pager --full || true

If it is clearly stale and not running useful container processes:

sudo systemctl stop "$UNIT" || true
sudo systemctl reset-failed "$UNIT"
sudo systemctl daemon-reload

docker start "$CID"

Do not start by manually deleting cgroup directories. Let systemd and Docker clean their own state first.


Step 6: Restart containerd and Docker

If the targeted systemd reset does not work:

sudo systemctl restart containerd
sudo systemctl restart docker

docker start "$CID"

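This can affect other running containers: restarting the Docker daemon stops them unless live-restore is enabled in the daemon configuration.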

A more deliberate sequence:

sudo systemctl stop docker
sudo systemctl stop containerd

sudo systemctl daemon-reload
sudo systemctl reset-failed

sudo systemctl start containerd
sudo systemctl start docker

docker start "$CID"

Docker daemon logs on systemd hosts can be viewed with journalctl:

sudo journalctl -u docker.service -u containerd.service -b --no-pager | tail -300

Root-cause investigation

1. Check for OOM

docker inspect "$CID" \
  --format 'OOMKilled={{.State.OOMKilled}} ExitCode={{.State.ExitCode}} Error={{.State.Error}} RestartCount={{.RestartCount}}'

Kernel logs:

sudo journalctl -k -b --no-pager \
  | grep -i -E 'oom|out of memory|killed process|memory cgroup' || true

Previous boot:

sudo journalctl -k -b -1 --no-pager \
  | grep -i -E 'oom|out of memory|killed process|memory cgroup' || true

If you see lines like:

Out of memory: Killed process ...
Memory cgroup out of memory
oom-kill
Killed process ... python
Killed process ... node
Killed process ... java

then the systemd scope failure may be a secondary symptom. The real trigger may be memory exhaustion.
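
To see at a glance which processes were killed, the kernel log can be reduced to PID and process name. This is a sketch: oom_victims is a hypothetical helper, and it assumes the usual "Killed process <pid> (<name>)" wording, which can differ between kernel versions.

```shell
#!/usr/bin/env bash
# Sketch: list "<pid> <name>" for every OOM kill found in kernel log text.
# oom_victims is a hypothetical helper. It assumes the kernel's usual
# "Killed process <pid> (<name>)" wording, which can vary between versions.
oom_victims() {
  sed -nE 's/.*Killed process ([0-9]+) \(([^)]*)\).*/\1 \2/p'
}

# Usage (commented out because it needs journal access on the host):
#   sudo journalctl -k -b --no-pager | oom_victims | sort | uniq -c
```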

If OOM is confirmed, use memory limits deliberately:

docker run \
  --memory=8g \
  --memory-reservation=6g \
  --memory-swap=10g \
  ...

For Compose:

services:
  app:
    image: <image>:<tag>
    mem_limit: 8g
    memswap_limit: 10g
    restart: unless-stopped

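Be careful with --memory-swap: Docker documents it as the combined total of memory plus swap, not the swap size alone. With --memory=8g and --memory-swap=10g, the container can use 8g of memory and 2g of swap.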
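
Because --memory-swap is the memory-plus-swap total, the swap actually available to a container is the difference between the two values. A minimal sketch of that arithmetic follows. swap_allowance is a hypothetical helper, and its arguments are assumed to be the byte values that docker inspect prints for .HostConfig.Memory and .HostConfig.MemorySwap.

```shell
#!/usr/bin/env bash
# Sketch: compute how much swap a container may use, in bytes.
# swap_allowance is a hypothetical helper. Arguments are the byte values
# printed by docker inspect for .HostConfig.Memory and .HostConfig.MemorySwap.
swap_allowance() {
  local mem=$1 memswap=$2
  if [ "$memswap" -eq -1 ]; then
    echo "unlimited"             # -1 means unlimited swap
  elif [ "$mem" -eq 0 ] || [ "$memswap" -eq 0 ]; then
    echo "unset"                 # no explicit limit recorded
  else
    echo $(( memswap - mem ))    # MemorySwap is memory + swap combined
  fi
}

# Usage (commented out because it needs a Docker host):
#   swap_allowance "$(docker inspect -f '{{.HostConfig.Memory}}' "$CID")" \
#                  "$(docker inspect -f '{{.HostConfig.MemorySwap}}' "$CID")"
```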

Do not blindly use:

--oom-kill-disable

unless you also set a memory limit and understand the consequences. Docker warns that disabling the OOM killer without memory limits can destabilize the host.


2. Check for restart loops

docker inspect "$CID" \
  --format 'RestartPolicy={{json .HostConfig.RestartPolicy}} RestartCount={{.RestartCount}}'

Logs:

docker logs --tail=300 "$CID"

If restart count is high, temporarily disable restart while debugging:

docker update --restart=no "$CID"

Then start manually:

docker start "$CID"
docker logs -f "$CID"

This separates two problems:

Problem A: the application crashes or OOMs.
Problem B: Docker/systemd cleanup gets stuck afterward.

Fix Problem A if it keeps triggering Problem B.


3. Test whether the problem is container-specific or host-wide

docker run --rm hello-world
docker run --rm alpine true

Interpretation:

Result | Meaning
--- | ---
hello-world works, affected container fails | More likely stale scope tied to this container, app OOM, or app restart loop.
hello-world also fails with cgroup/scope error | More likely host-wide Docker/containerd/runc/systemd/cgroup issue.
hello-world sometimes works, sometimes fails | Intermittent runtime/systemd/D-Bus/cgroup instability.

This is one of the fastest ways to avoid debugging the wrong layer.
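
Since the failure is intermittent, a single run of hello-world may not tell you much. A small loop that repeats the test and summarizes the result could look like this sketch. run_trials is a hypothetical helper, and the number of trials is up to you.

```shell
#!/usr/bin/env bash
# Sketch: run a command N times and report whether it always succeeds,
# always fails, or fails intermittently.
# run_trials is a hypothetical helper, not a Docker command.
run_trials() {
  local n=$1 i=0 failures=0
  shift
  while [ "$i" -lt "$n" ]; do
    # Count non-zero exit codes; the command's own output is discarded.
    "$@" >/dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
  done
  if [ "$failures" -eq 0 ]; then
    echo "always-ok"
  elif [ "$failures" -eq "$n" ]; then
    echo "always-fails"
  else
    echo "intermittent $failures/$n"
  fi
}

# Usage (commented out because it needs a Docker host):
#   run_trials 10 docker run --rm hello-world
```

An "intermittent" result for hello-world points at the host rather than at the affected container.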


4. Check runtime and OS versions

docker version
docker info
containerd --version
runc --version
systemctl --version
uname -a
cat /etc/os-release
cat /etc/docker/daemon.json 2>/dev/null || true

Pay attention to:

Cgroup Driver
Cgroup Version
Runtimes
Default Runtime
Rootless
Kernel Version
Operating System
Docker package source

If packages are old or mixed, update Docker/containerd/runc/systemd/kernel using the OS-supported method and reboot.

On Ubuntu/Debian with Docker’s official packages, this is commonly:

sudo apt update
sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo reboot

If using distro packages such as docker.io, mention that when reporting the issue, because Docker Engine, containerd, and runc versions may differ from Docker’s official packaging.


5. Check rootless Docker

docker info | grep -i rootless

If rootless mode is enabled:

systemctl --user status docker --no-pager || true
systemctl --user list-units --all --type=scope | grep docker || true
systemctl --user --failed | grep docker || true

For rootless Docker:

systemctl --user reset-failed 'docker-*.scope'
systemctl --user daemon-reload
systemctl --user restart docker

For normal rootful Docker:

sudo systemctl reset-failed 'docker-*.scope'
sudo systemctl daemon-reload
sudo systemctl restart containerd docker

Why deleting/recreating the container helps temporarily

Deleting and recreating the container changes the full container ID.

Old container:

e16fc114c32f6c9154ae3c2c0bb8ade6298ff61eaf61529b58bd2939eefd7f7c

Old scope:

docker-e16fc114c32f6c9154ae3c2c0bb8ade6298ff61eaf61529b58bd2939eefd7f7c.scope

New container:

some-new-full-container-id

New scope:

docker-some-new-full-container-id.scope

So recreation avoids the stuck unit name.

But if the trigger is OOM, restart loop, runtime cleanup race, or host instability, the same class of failure can return later.


My suggested command sequence

A. Identify, inspect, reset

CID=e16fc114c32f
FULL_ID=$(docker inspect -f '{{.Id}}' "$CID")
UNIT="docker-${FULL_ID}.scope"

echo "$UNIT"

docker inspect "$CID" \
  --format 'Status={{.State.Status}} ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Error={{.State.Error}} Restarting={{.State.Restarting}} RestartCount={{.RestartCount}}'

docker inspect "$CID" \
  --format 'RestartPolicy={{json .HostConfig.RestartPolicy}} Memory={{.HostConfig.Memory}} MemorySwap={{.HostConfig.MemorySwap}} OomKillDisable={{.HostConfig.OomKillDisable}}'

sudo systemctl status "$UNIT" --no-pager --full || true

sudo systemctl show "$UNIT" \
  -p Id -p LoadState -p ActiveState -p SubState -p FragmentPath -p ControlGroup \
  --no-pager || true

sudo systemctl list-units --all --type=scope | grep "$FULL_ID" || true
sudo systemctl --failed | grep "$FULL_ID" || true

sudo systemctl reset-failed "$UNIT"
sudo systemctl daemon-reload

docker start "$CID"

B. If that fails

sudo systemctl stop "$UNIT" || true
sudo systemctl reset-failed "$UNIT"
sudo systemctl daemon-reload

docker start "$CID"

C. If that still fails

sudo systemctl restart containerd
sudo systemctl restart docker

docker start "$CID"

D. If it keeps recurring

sudo journalctl -k -b --no-pager \
  | grep -i -E 'oom|out of memory|killed process|memory cgroup' || true

docker inspect "$CID" \
  --format 'RestartPolicy={{json .HostConfig.RestartPolicy}} RestartCount={{.RestartCount}} OOMKilled={{.State.OOMKilled}} ExitCode={{.State.ExitCode}}'

docker logs --tail=300 "$CID"

docker info | grep -i -E 'Cgroup|Runtime|Rootless|containerd|runc|Kernel|Operating System'

What not to do first

Do not manually delete cgroup directories first

Avoid starting with:

sudo rm -rf /sys/fs/cgroup/system.slice/docker-*.scope

The cgroup tree is kernel/systemd-managed state. Manual deletion can make diagnosis worse.

Prefer:

systemctl status
systemctl stop
systemctl reset-failed
systemctl daemon-reload
systemctl restart docker/containerd

Do not blindly kill containerd-shim processes

Avoid:

sudo kill -9 <pid>

unless you have mapped that PID to the affected container and understand the impact.


Do not disable the OOM killer as a shortcut

Avoid:

--oom-kill-disable

unless you also set memory limits and understand the consequences.


Do not treat docker rm && docker run as a real fix

It is a workaround because it changes the container ID and scope name. It does not explain why the scope got stuck.


Minimal diagnostic bundle

If asking Docker, your distro, an infrastructure admin, or a hosting provider, collect this:

CID=e16fc114c32f
FULL_ID=$(docker inspect -f '{{.Id}}' "$CID" 2>/dev/null || true)
UNIT="docker-${FULL_ID}.scope"

{
  echo '### date'
  date -Is

  echo '### docker version'
  docker version || true

  echo '### docker info'
  docker info || true

  echo '### containerd version'
  containerd --version || true

  echo '### runc version'
  runc --version || true

  echo '### systemd version'
  systemctl --version || true

  echo '### kernel / os'
  uname -a || true
  cat /etc/os-release || true

  echo '### container state'
  docker inspect "$CID" --format '{{json .State}}' || true

  echo '### restart and memory config'
  docker inspect "$CID" \
    --format 'RestartPolicy={{json .HostConfig.RestartPolicy}} Memory={{.HostConfig.Memory}} MemorySwap={{.HostConfig.MemorySwap}} OomKillDisable={{.HostConfig.OomKillDisable}}' || true

  echo '### target unit'
  echo "$UNIT"
  sudo systemctl status "$UNIT" --no-pager --full || true
  sudo systemctl show "$UNIT" \
    -p Id -p LoadState -p ActiveState -p SubState -p FragmentPath -p ControlGroup \
    --no-pager || true

  echo '### docker scopes'
  sudo systemctl list-units --all --type=scope | grep docker || true

  echo '### failed units'
  sudo systemctl --failed || true

  echo '### docker/containerd logs'
  sudo journalctl -u docker.service -u containerd.service -b --no-pager | tail -500 || true

  echo '### kernel OOM logs'
  sudo journalctl -k -b --no-pager \
    | grep -i -E 'oom|out of memory|killed process|memory cgroup' || true
} > docker-scope-debug.txt 2>&1

Review docker-scope-debug.txt before sharing it. It may contain private paths, image names, hostnames, or internal service names.


Bottom line

For this exact error, I would treat the immediate cause as:

systemd still has docker-e16fc114c32f6c9154ae3c2c0bb8ade6298ff61eaf61529b58bd2939eefd7f7c.scope
loaded, failed, active, or represented by a transient fragment.

Most likely triggers:

  1. stale/failed systemd scope after a previous failed start;
  2. OOM kill or memory pressure;
  3. crash/restart loop;
  4. Docker/containerd/runc/systemd cleanup race;
  5. host runtime version or cgroup v2 issue;
  6. rootless Docker only if docker info says rootless.

Best first recovery:

CID=e16fc114c32f
FULL_ID=$(docker inspect -f '{{.Id}}' "$CID")
UNIT="docker-${FULL_ID}.scope"

sudo systemctl reset-failed "$UNIT"
sudo systemctl daemon-reload
docker start "$CID"

Then immediately check the trigger:

sudo journalctl -k -b --no-pager \
  | grep -i -E 'oom|out of memory|killed process|memory cgroup' || true

docker inspect "$CID" \
  --format 'RestartPolicy={{json .HostConfig.RestartPolicy}} RestartCount={{.RestartCount}} OOMKilled={{.State.OOMKilled}} ExitCode={{.State.ExitCode}}'

docker info | grep -i -E 'Cgroup|Runtime|Rootless|containerd|runc|Kernel|Operating System'

Deleting and recreating the container is only a workaround. It changes the container ID and therefore the docker-<id>.scope name. If the underlying trigger remains, the error can return.

It sounds like the issue may be related to Docker cgroup or systemd conflicts rather than the container itself, especially since the container sometimes starts normally and sometimes fails randomly. Recreating the container can temporarily help, but checking Docker version compatibility, restarting the Docker service, and cleaning old container scope processes might also be worth trying.